ABSTRACT

We introduce redMaGiC, an automated algorithm for selecting Luminous Red Galaxies
(LRGs). The algorithm was specifically developed to minimize photometric redshift
uncertainties in photometric large-scale structure studies. redMaGiC achieves this by
self-training the color cuts necessary to produce a luminosity-thresholded LRG sample
of constant comoving density. We demonstrate that redMaGiC photo-zs are very
nearly as accurate as the best machine-learning based methods, yet they require minimal
spectroscopic training, do not suffer from extrapolation biases, and are very nearly
Gaussian. We apply our algorithm to Dark Energy Survey (DES) Science Verification
(SV) data to produce a redMaGiC catalog sampling the redshift range $z \in [0.2, 0.8]$.
Our fiducial sample has a comoving space density of $10^{-3}\,(h^{-1}\,{\rm Mpc})^{-3}$, and a median
photo-z bias ($z_{\rm spec} - z_{\rm photo}$) and scatter ($\sigma_z/(1+z)$) of 0.005 and 0.017 respectively.
The corresponding $5\sigma$ outlier fraction is 1.4%. We also test our algorithm with Sloan
Digital Sky Survey (SDSS) Data Release 8 (DR8) and Stripe 82 data, and discuss how
spectroscopic training can be used to control photo-z biases at the 0.1% level.

1 INTRODUCTION 


Since the beginning of the Sloan Digital Sky Survey (SDSS;
York et al. 2000), it has been recognized that luminous
red galaxies (LRGs) are an ideal probe of large-scale structure
(Stoughton et al. 2002). Being luminous, they can be
observed to high redshift with relatively shallow exposures.
In addition, the 4000 Å break in the spectra of these galaxies
enables robust photometric redshift estimates (photo-zs)
when the break is photometrically sampled. To date, red
galaxy selection algorithms have been fairly crude: one typically
defines a color box that isolates LRGs in color-color
space, with the specific cuts being selected in a relatively
ad hoc manner (e.g. Eisenstein et al. 2001, 2005). This relative
lack of attention is driven by the fact that spectroscopic
follow-up renders high precision selection of LRGs
unnecessary. With the advent of photometric surveys with
no spectroscopic component like the DES (The Dark Energy
Survey Collaboration 2005) and the Large Synoptic Survey
Telescope (LSST; LSST Science Collaboration 2009), it is
now important to develop selection algorithms designed to
minimize photometric redshift uncertainties.

To this end, we have developed redMaGiC, a new red-galaxy
selection algorithm. Specifically, our primary motivation
is to select galaxies with robust, exquisitely controlled
photometric redshifts. A secondary and complementary goal
is to develop a new photometric redshift estimator for these
galaxies that is well understood, and has spectroscopic
requirements that are easily met with existing facilities.
The algorithm relies heavily on the infrastructure built for
red sequence cluster finding with redMaPPer (Rykoff et al.
2014, henceforth RM1). Specifically, redMaPPer combines
sparse spectroscopy of galaxy clusters with photometric data
to calibrate the red sequence of galaxies as a function of
redshift. We use the resulting calibration as a photometric
template, and select a galaxy as red if this empirical template
provides a good description of the galaxy's color. We
refer to the resulting galaxy catalog as the red-sequence
Matched-filter Galaxy Catalog, or redMaGiC for short.

We implement our algorithm in the DES Science Verification
(SV) data (Rykoff et al., in prep) and characterize
the photo-z properties of the resulting catalog. To provide
further photo-z testing, we have also applied redMaGiC to
SDSS DR8 and SDSS Stripe 82 data.

The layout of the paper is as follows. Section 2 briefly
summarizes the data sets used in this work. Section 3
describes the redMaGiC selection algorithm and the redMaGiC
photo-z estimator. Section 4 evaluates the performance
of redMaGiC in each of the three data sets considered
in this work, while the section after it compares the redMaGiC
photo-z performance to several other photo-z methods. We then
demonstrate that redMaGiC succeeds at selecting galaxies
with clean photo-zs by comparing redMaGiC galaxies
to the SDSS "constant mass" CMASS sample, which was
specifically tailored for spectroscopic follow-up of galaxies
at $z \gtrsim 0.45$ (Dawson et al. 2013), and discuss how
redMaGiC can be improved upon if representative spectroscopic
subsamples of redMaGiC galaxies become available.
A further section characterizes redMaGiC catastrophic failures,
which we take to mean $5\sigma$ outliers. A discussion and
summary of our conclusions closes the paper.

Fiducial cosmology and conventions: The construction
of the redMaGiC galaxy samples requires that one
specify a cosmology for computing the comoving density
of galaxies, and for estimating luminosity distances. To do
this, we assume a flat $\Lambda$CDM cosmology with $\Omega_m = 0.3$ and
$h = 1.0$ (i.e. distances are in $h^{-1}$ Mpc). This is the convention
used by redMaPPer.
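
As an illustration of how this convention enters the comoving density computation, the following is a minimal sketch (not the code used in this work) of the comoving volume of a redshift shell in a flat $\Lambda$CDM cosmology with $\Omega_m = 0.3$ and $h = 1$. The function names, the trapezoidal integration, and the number of integration steps are assumptions made for the example.

```python
import math

OMEGA_M = 0.3              # fiducial matter density quoted in the text
HUBBLE_DIST = 2997.92458   # c / H0 in h^-1 Mpc, i.e. for h = 1

def e_of_z(z):
    """Dimensionless Hubble rate E(z) for a flat LCDM cosmology."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + (1.0 - OMEGA_M))

def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance in h^-1 Mpc (trapezoidal rule)."""
    if z <= 0.0:
        return 0.0
    dz = z / steps
    total = 0.5 * (1.0 / e_of_z(0.0) + 1.0 / e_of_z(z))
    for k in range(1, steps):
        total += 1.0 / e_of_z(k * dz)
    return HUBBLE_DIST * total * dz

def shell_volume(z_lo, z_hi, area_deg2):
    """Comoving volume in (h^-1 Mpc)^3 of a redshift shell over area_deg2."""
    full_sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2
    sky_fraction = area_deg2 / full_sky_deg2
    d_lo, d_hi = comoving_distance(z_lo), comoving_distance(z_hi)
    return sky_fraction * 4.0 * math.pi / 3.0 * (d_hi ** 3 - d_lo ** 3)
```

Multiplying `shell_volume(z_lo, z_hi, area)` by a target comoving density gives the expected number of galaxies in that shell, which is the quantity a constant-density selection has to reproduce.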

Finally, this work references both z-band magnitudes
and galaxy redshifts. To avoid confusion, we denote z-band
magnitudes via $m_z$ and reserve the symbol $z$ to signify redshift.
Similarly, we refer to i-band magnitudes via $m_i$ to
distinguish from the counting index $i$.


2 DATA 


2.1 DES Science Verification Data 

DES is a wide-field photometric survey in the grizY bands
performed with the Dark Energy Camera (DECam; Diehl
et al. 2012; Flaugher et al. 2015). The DECam is installed
at the prime focus of the 4-meter Blanco Telescope at Cerro
Tololo Inter-American Observatory (CTIO). The full DES
survey is scheduled for 525 nights distributed over five years,
covering 5000 deg$^2$ of the southern sky, approximately half
of which overlaps the South Pole Telescope (SPT; Carlstrom
et al. 2011) Sunyaev-Zel'dovich cluster survey.

Prior to the commencement of regular survey operations
in August 2013, from November 2012 to March 2013 DES
conducted a $\sim 300$ deg$^2$ "Science Verification" (SV) survey.
The main portion of the SV footprint, used in this paper,
covers the $\sim 150$ deg$^2$ Eastern SPT ("SPTE") region, in
the range $65^\circ < {\rm R.A.} < 93^\circ$ and $-60^\circ < {\rm Decl.} < -42^\circ$. SPTE
was observed between 2 and 10 tilings in each of the griz
filters. In addition, DES surveys 10 Supernova fields every 5-7
days, each of which covers a single DECam 2.2 degree-wide
field-of-view. The median depths of the SV survey (defined as
$10\sigma$ detections for extended sources) are g = 24.0, r = 23.9,
i = 23.0, z = 22.3, and Y = 20.8.

The DES SV data was processed by the DES Data Management
(DESDM) infrastructure (Gruendl et al., in prep).
This processing performs image deblending, astrometric registration,
global calibration, image coaddition, and object
catalog creation. Details of the DES single-epoch and coadd
processing can be found in Sevilla et al. (2011) and Desai
et al. (2012). We use SExtractor to create object catalogs
from the single-epoch and coadded images (Bertin
& Arnouts 1996; Bertin 2011). Object detection was performed
on a "chi-squared" coadd of the r+i+z image with
SWarp (Bertin 2010), and object measurement was performed
in dual-image mode with each individual griz image
(here we ignore the shallow Y-band imaging).

After production of these early data, several problems
were detected and corrected for in post-processing, leading
to the creation of the "SVA1 Gold" catalog (Rykoff et al., in
prep). First, unmasked satellite trails were masked. Second,
calibration was improved using a modified version of the
big-macs stellar-locus fitting code (Kelly et al. 2014)[1]. We
recomputed coadd zero-points over the full SV footprint on
a HEALPix (Gorski et al. 2005) grid of NSIDE=256. These
zero-points were then interpolated with a bi-linear scheme to
correct the magnitudes of all objects in the catalog. Finally,
regions around bright stars (J < 13) from the Two Micron
All Sky Survey (2MASS; Skrutskie et al. 2006) were masked.

[1] https://code.google.com/p/big-macs-calibrate/

Galaxy magnitudes and colors are computed via the
SExtractor MAG_AUTO quantity. These colors are significantly
noisier than those obtained through model fitting. However,
for SV coadd images MAG_AUTO colors are considerably more
stable than model-fit colors, which are affected by PSF
discontinuities in the coadded images sourced by coadding
different exposures. The extra noise is expected
to have a negative impact on our results, and future work
will make use of full multi-epoch, multi-band galaxy color
measurements.

Star-galaxy separation is a particularly challenging issue
for red galaxy selection at high redshift. In particular, at
$z \sim 0.7$ the red end of the stellar locus approaches the red
sequence galaxy locus when using purely optical (griz) photometry.
Therefore, we have made use of the ngmix multi-band
multi-epoch image processing (Sheldon et al., in prep;
Jarvis et al., in prep) to select a relatively pure and complete
galaxy sample. Details are presented in Appendix A.
As ngmix is primarily used for shape measurements on DES
data, the tolerance for input image quality is relatively tight,
so our footprint is smaller than that of SVA1 Gold (see Jarvis
et al., in prep). Finally, we only consider regions where the
z-band $10\sigma$ depth in MAG_AUTO has $m_z > 22$ (Rykoff et al., in
prep). In total, we use 148 deg$^2$ of DES SV imaging in this
paper, and the angular mask is described in Appendix B.

We note that redMaGiC relies on the red sequence calibration
by the redMaPPer algorithm, as detailed in RM1.
The DES SV redMaPPer cluster catalog is described in
Rykoff et al. (in prep). We refer the reader to that work
for a detailed description of the catalog. Here, we simply
note that the redMaPPer calibration of the red sequence requires
spectroscopic training data for galaxy clusters. This
spectroscopic data set is composed primarily of existing external
spectroscopic surveys, including the Galaxy and Mass
Assembly survey (GAMA; Driver et al. 2011), the VIMOS
VLT Deep Survey (VVDS; Garilli et al. 2008), the 2dF
Galaxy Redshift Survey (2dFGRS; Colless et al. 2001), the
Sloan Digital Sky Survey (SDSS; Ahn et al. 2013), the VIMOS
Public Extragalactic Survey (VIPERS; Garilli et al.
2014), the UKIDSS Ultra-Deep Survey (UDSz; Bradshaw et al.
2013; McLure et al. 2013), and the Arizona CDFS
Environment Survey (ACES; Cooper et al. 2012). In addition,
we have a small sample of cluster redshifts from SPT
used in the cluster validation (Bleem et al. 2015). These
data sets have been further supplemented by galaxy spectra
acquired as part of the OzDES spectroscopic survey, which
is performing spectroscopic follow-up on the AAOmega instrument
at the Anglo-Australian Telescope (AAT) in the
DES supernova fields (Yuan et al. 2015). The total number
of spectroscopic cluster redshifts used in our calibration is
625, most of which are low richness. By point of comparison,
current DES machine learning methods rely on over 46,000
spectra.

Figure 1 shows the angular density contrast of our fiducial
redMaGiC galaxy sample in the so-called DES SV SPTE
region. The full DES SV catalog also includes the DES supernova
fields, which are disconnected from the SPTE field.
We note that very nearly all the spectroscopic training data
sets reside in the DES supernova fields, which places significant
limitations on our ability to validate the performance
of redMaGiC on the DES SV data set.

We note that the survey depth varies significantly over
the footprint. In some regions we can comfortably reach high
redshifts ($z < 1$), while in other regions the depth is insufficient.
To obtain a homogeneous catalog across the full footprint
we restrict ourselves to redMaGiC galaxies over the
redshift range $z \in [0.2, 0.8]$.

2.2 SDSS DR8 Data 

We apply the redMaGiC algorithm to SDSS DR8 photometric
data (Aihara et al. 2011). The DR8 galaxy catalog
contains $\approx 14{,}000$ deg$^2$ of imaging, which we reduce to
$\sim 10{,}000$ deg$^2$ of contiguous high quality observations using
the mask from the Baryon Oscillation Spectroscopic Survey
(BOSS; Dawson et al. 2013). The mask is further extended
to include all stars in the Yale Bright Star Catalog (Hoffleit
& Jaschek 1991), as well as the area around objects in the
New General Catalog (NGC; Sinnott 1988). The resulting
mask is that used by Rykoff et al. (2014) to generate the
SDSS DR8 redMaPPer catalog. We refer the reader to that
work for further discussion on the mask.

Galaxies are selected using the default SDSS
star/galaxy separator. We filter all galaxies with any
of the following flags in the g, r, or i bands: SATUR_CENTER,
BRIGHT, TOO_MANY_PEAKS, and (NOT BLENDED OR
NODEBLEND). Unlike the BOSS target selection, we keep objects
flagged with SATURATED, NOTCHECKED, and PEAKCENTER.
A discussion of these choices can be found in RM1. Total
magnitudes are determined from i-band CMODEL_MAG and
colors from ugriz MODEL_MAG.

The red sequence model is that of the SDSS DR8
redMaPPer v6.3 cluster catalog (Rykoff et al., in prep). This
catalog is an updated version of the redMaPPer catalog in
RM1 (v5.2), and supersedes both it and the update in Rozo
et al. (2014, v5.10). Spectroscopic training data are drawn
from the SDSS DR10 spectroscopic data set (Ahn et al.
2013).


2.3 SDSS Stripe 82 Data 

We apply the redMaGiC algorithm to SDSS Stripe 82 (S82)
coadd data (Annis et al. 2011). The S82 catalog consists
of 275 deg$^2$ of ugriz coadded imaging over the equatorial
stripe. The coadd is roughly 2 magnitudes deeper than the
single-pass SDSS data. We use the same flag cuts as those
used for the DR8 catalog. In addition, we remove all galaxies
with extremely large magnitude errors. Total magnitudes
are determined from i-band CMODEL_MAG and colors from griz
MODEL_MAG. Most moderate to high redshift ($z > 0.3$) red galaxies
in S82 are u-band dropouts, so we opted to rely exclusively
on griz photometry for S82 runs. However, we demonstrate
the utility of the u-band imaging at low redshift in a
later section.


(c) 0000 RAS, MNRAS 000, 000-000 

4 E. Rozo, et. al 


[Figure 1: sky map of the SVA1 (SPT-E) region; horizontal axis RA from 90° to 60°, vertical axis Dec from -60° to -45°, color scale from -0.4 to 0.4.]

Figure 1. Angular galaxy density contrast $\delta = (\rho - \bar{\rho})/\bar{\rho}$ for DES SV redMaGiC galaxies in the redshift range [0.2, 0.8], averaged on a
15' scale. This plot uses our fiducial redMaGiC sample (see text).


We have run the redMaPPer algorithm on this photometric
data set, using SDSS DR10 spectroscopy as the spectroscopic
training data set. In addition, for high redshift
performance validation we make use of VIPERS
(Franzetti et al. 2014). During our testing and validation of
the redMaPPer catalog on these data, we discovered that
$\approx 15\%$ of red cluster member galaxies in the S82 data set
have reported magnitudes that are clearly incorrect in one
or more bands. We do not know the origin of this failure, nor
whether it extends to other galaxies (blue cluster galaxies
or field galaxies). These errors inevitably bias the resulting
cluster richness estimates. Consequently, we have opted
not to release the S82 redMaPPer and redMaGiC catalogs.
Nevertheless, we include a discussion of these data because
the photo-z performance of redMaGiC in this data set provides
a valuable baseline to compare against the DES SV
redMaGiC sample.


3 THE redMaGiC SELECTION ALGORITHM

The redMaGiC algorithm can be summarized very simply:

(i) Fit every galaxy to a red sequence template. Compute
the corresponding best fit redshift $z_{\rm photo}$, and the goodness-of-fit
$\chi^2$ of the template fit.

(ii) Given $z_{\rm photo}$, compute the galaxy luminosity $L$.

(iii) If the galaxy is bright ($L \geq L_{\rm min}$), and it is a good
fit to the red sequence template ($\chi^2 \leq \chi^2_{\rm max}$), include it in
the redMaGiC catalog. Otherwise, drop it.

As long as $\chi^2_{\rm max}$ is sufficiently aggressive, the resulting catalog
will be comprised very nearly exclusively of red sequence
galaxies. In addition, if the red sequence photometric template
is accurate, then the resulting redshifts should be of
excellent quality. In what follows, we describe how we construct
our red sequence template, and how the maximum
goodness-of-fit value $\chi^2_{\rm max}$ is selected so as to ensure that
the resulting redMaGiC galaxy sample has a constant comoving
space density. It should be noted that our template
is not a spectroscopic template. Rather, we model the colors
as a function of redshift and magnitude directly, without
ever going through a spectrum. When we refer to the redMaGiC
template, we always mean our model colors.
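
The three selection steps can be sketched as a simple filter. This is only an illustrative outline: the callables `fit_template`, `luminosity`, and `chi2_max`, and the dictionary-based galaxy records, are hypothetical placeholders for the template fit, luminosity estimate, and threshold described in the text.

```python
def select_redmagic(galaxies, fit_template, luminosity, l_min, chi2_max):
    """Apply the three selection steps to an iterable of galaxy records.

    fit_template(galaxy) -> (z_photo, chi2)   step (i), hypothetical callable
    luminosity(galaxy, z_photo) -> L / L_*    step (ii), hypothetical callable
    chi2_max(z_photo) -> chi^2 threshold      used in step (iii)
    """
    selected = []
    for gal in galaxies:
        z_photo, chi2 = fit_template(gal)        # (i) template fit
        lum = luminosity(gal, z_photo)           # (ii) luminosity at z_photo
        if lum >= l_min and chi2 <= chi2_max(z_photo):   # (iii) both cuts
            kept = dict(gal)
            kept.update(z_photo=z_photo, chi2=chi2, l=lum)
            selected.append(kept)
    return selected
```

Passing the threshold as a function of redshift, rather than as a constant, leaves room for the redshift-dependent cut that the algorithm calibrates.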


3.1 The redMaGiC Template 

The redMaGiC algorithm relies on the redMaPPer calibration
of the red sequence, so we begin our discussion by reviewing
how the redMaPPer template is constructed. Let $c$
be the color vector of a galaxy and $m$ denote the galaxy's
magnitude in some reference band. When possible, the reference
band should lie redwards of the 4000 Å break at all
redshifts, which leads us to select $m_z$ as the reference magnitude
for the DES redMaGiC sample. The lower redshift
range of the SDSS catalogs allows us to use $m_i$ in those
data sets. One could in principle use $m_z$ in SDSS as well,
but since SDSS $m_i$ is much less noisy than $m_z$ we rely on
i-band for the SDSS data.

Red sequence galaxies populate a narrow ridgeline in
color-magnitude space, though with some intrinsic scatter,
which we model as Gaussian. In this case, the ridgeline corresponds
to the mean color of red sequence galaxies. We
write

$$\langle c | m, z \rangle = a(z) + \alpha(z)\,(m - m_{\rm ref}(z)). \tag{1}$$

Here $a(z)$ and $\alpha(z)$ are the unknown redshift-dependent
amplitude and slope of the red sequence. The magnitude
$m_{\rm ref}(z)$ defines the pivot point of the color-magnitude relation.
Its value is arbitrary and can be freely chosen by the experimenter.
redMaPPer selects $m_{\rm ref}(z)$ so that it traces the
median magnitude of the cluster member galaxies. The unknown
functions $a(z)$ and $\alpha(z)$ are parameterized via spline
interpolation, with the model parameters being the values of
the functions at a grid of redshifts.
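
A minimal sketch of evaluating equation (1) for a single color is given below. For brevity it interpolates the node values linearly, which is a stand-in (an assumption of this example) for the spline interpolation used by redMaPPer; the function names and node arrays are hypothetical.

```python
from bisect import bisect_right

def interp_nodes(z, z_nodes, values):
    """Piecewise-linear interpolation of node values (stand-in for a spline)."""
    if z <= z_nodes[0]:
        return values[0]
    if z >= z_nodes[-1]:
        return values[-1]
    k = bisect_right(z_nodes, z) - 1
    t = (z - z_nodes[k]) / (z_nodes[k + 1] - z_nodes[k])
    return (1.0 - t) * values[k] + t * values[k + 1]

def mean_color(m, z, z_nodes, a_nodes, alpha_nodes, mref_nodes):
    """Equation (1): <c|m,z> = a(z) + alpha(z) * (m - m_ref(z))."""
    a = interp_nodes(z, z_nodes, a_nodes)
    alpha = interp_nodes(z, z_nodes, alpha_nodes)
    m_ref = interp_nodes(z, z_nodes, mref_nodes)
    return a + alpha * (m - m_ref)
```

The model parameters are just the node values (`a_nodes`, `alpha_nodes`), so fitting the red sequence amounts to adjusting a small number of values on the redshift grid.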

The covariance matrix $C_{\rm int}$ characterizing the intrinsic
width of the red sequence in multi-dimensional color
space is assumed to be independent of magnitude. The covariance
matrix is, however, assumed to vary as a function
of redshift. As with the functions $a(z)$ and $\alpha(z)$, the matrix
$C_{\rm int}(z)$ is parameterized via spline interpolation, with
the model parameters being the values of each independent
matrix element along a grid of redshifts. Together with the
parameters for $a(z)$ and $\alpha(z)$, this set of model parameters $p$
fully specifies the color distribution of red sequence galaxies
$P(c | p; m, z)$.

The parameters $p$ specifying our color model are fit using
an iterative maximum likelihood approach. Briefly, given
a cluster galaxy with a spectroscopic redshift $z_{\rm spec}$, and a
rough estimate for the parameters $p$, one can photometrically
select cluster galaxies using a matched-filter approach.
Given these initial photometric cluster members, one then
defines the likelihood

$$\mathcal{L}(p) = \prod_i P(c_i | p; m_i, z_{\rm cluster}), \tag{2}$$

where the product is over all the selected cluster members.
In practice, the likelihood is modified to allow for contamination
by interlopers (Rykoff et al. 2014). A new set of parameters
$p$ is estimated by maximizing the above likelihood,
and the whole procedure is iterated until convergence. For
further details, see RM1. The end result of the above procedure
is a strictly empirical calibration of the red sequence
of cluster galaxies as a function of redshift.


3.2 redMaGiC Photometric Redshifts

We want to estimate the photometric redshift of a galaxy of
magnitude $m$ and color $c$. We use an updated version of
the photometric redshift estimator $z_{\rm red}$ introduced in RM1.
The probability that a red galaxy selected from a constant
comoving density sample has redshift $z$, magnitude $m$, and
color $c$ is denoted $P(c, m, z)$. One has

$$P(c, m, z) = P(c | m, z)\,P(m | z)\,P(z). \tag{3}$$


We are interested in the redshift probability distribution

$$P(z | c, m) = \frac{P(c, m, z)}{P(c, m)} \tag{4}$$

$$\phantom{P(z | c, m)} = \frac{P(c | m, z)\,P(m | z)\,P(z)}{P(c, m)}. \tag{5}$$

Since the denominator is redshift independent, we can ignore
it. The corresponding likelihood is

$$\mathcal{L}(z) = P(c | m, z)\,P(m | z)\,P(z). \tag{6}$$

For a constant comoving density sample $P(z) \propto |dV/dz|$.
$P(m | z)$ is modeled assuming the galaxies follow
a Schechter luminosity function,

$$P(m | z) \propto \exp\left[ -10^{-0.4\,(m - m_*)} \right]. \tag{7}$$

The value $m_*(z)$ is set to $m_i = 17.85$ at $z = 0.2$ to match
redMaPPer. The evolution of $m_*(z)$ is computed using the
Bruzual & Charlot (2003, BC03) stellar population synthesis
code as implemented in the EzGal Python package[2]. We
model $m_*(z)$ using a single star formation burst at $z = 3$,
and we have confirmed this evolution matches that in RM1
at $z < 0.5$. The normalization condition for $m_z$ for DES
is then derived from the BC03 model using the DECam
passband. Finally, $P(c | m, z)$ is given by our red sequence model, so
that

$$P(c | m, z) \propto \exp\left( -\frac{1}{2} \chi^2(z) \right), \tag{8}$$

where

$$\chi^2(z) = (c - \langle c | m, z \rangle)^T\, C_{\rm tot}^{-1}\, (c - \langle c | m, z \rangle) \tag{9}$$

and

$$C_{\rm tot} = C_{\rm int} + C_{\rm obs} \tag{10}$$

is the total scatter about the red sequence color. Here, $C_{\rm obs}$
is the covariance matrix describing the photometric errors
in the galaxy colors. Our final expression for the redshift
likelihood is therefore

$$\ln \mathcal{L}(z) = -\frac{1}{2} \chi^2(z) + \ln P(m | z) + \ln \left| \frac{dV}{dz} \right|. \tag{11}$$

The photometric redshift $z_{\rm red}$ is the redshift at which
this log-likelihood function is maximized, and the corresponding
$\chi^2$ value is denoted $\chi^2_{\rm red}$. In addition, the galaxy
is also assigned a luminosity $l = L/L_*(z_{\rm red})$,

$$l(m, z_{\rm red}) = 10^{-0.4\,(m - m_*(z_{\rm red}))}. \tag{12}$$

The photometric redshift error $\sigma_z$ is estimated using the
variance of the posterior,

$$\sigma_z^2 = \langle z^2 \rangle - \langle z \rangle^2, \tag{13}$$

where

$$\langle z^n \rangle = \frac{\int dz\, \mathcal{L}(z)\, z^n}{\int dz\, \mathcal{L}(z)}. \tag{14}$$

[2] http://www.baryons.org/ezgal
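
To make the estimator concrete, here is a minimal single-color sketch of the likelihood scan: it evaluates $\ln \mathcal{L}(z)$ on a redshift grid, takes the maximum as $z_{\rm red}$, and uses the posterior moments for $\sigma_z$. Everything in `TOY_MODEL` (the linear color-redshift relation, the scatter values, and the linear $m_*(z)$) is invented for illustration and is not the calibrated red sequence model; the grid limits are likewise assumptions.

```python
import math

OMEGA_M = 0.3  # fiducial flat LCDM matter density

def e_of_z(z):
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + 1.0 - OMEGA_M)

def comoving_distance(z, steps=200):
    """Dimensionless comoving distance: integral of dz'/E(z') from 0 to z."""
    dz = z / steps
    total = 0.5 * (1.0 + 1.0 / e_of_z(z))
    total += sum(1.0 / e_of_z(k * dz) for k in range(1, steps))
    return total * dz

# Toy, purely illustrative model for one color (hypothetical numbers).
TOY_MODEL = {
    "mean_color": lambda m, z: 0.5 + 2.0 * z - 0.02 * (m - 19.0),
    "m_star": lambda z: 17.85 + 5.0 * (z - 0.2),
    "sigma_int": 0.05,   # intrinsic red-sequence width
    "sigma_obs": 0.03,   # photometric color error
}

def ln_likelihood(z, color, mag, model):
    """Return (ln L, chi2): -chi2/2 + ln P(m|z) + ln|dV/dz|, up to constants."""
    var_tot = model["sigma_int"] ** 2 + model["sigma_obs"] ** 2
    chi2 = (color - model["mean_color"](mag, z)) ** 2 / var_tot
    ln_p_mag = -(10.0 ** (-0.4 * (mag - model["m_star"](z))))
    # dV/dz is proportional to D_C(z)^2 / E(z) in a flat cosmology.
    ln_vol = 2.0 * math.log(comoving_distance(z)) - math.log(e_of_z(z))
    return -0.5 * chi2 + ln_p_mag + ln_vol, chi2

def photo_z(color, mag, model, z_min=0.05, z_max=1.0, n=400):
    """Grid scan returning (z_red, chi2_red, sigma_z)."""
    grid = [z_min + (z_max - z_min) * k / (n - 1) for k in range(n)]
    results = [ln_likelihood(z, color, mag, model) for z in grid]
    best = max(range(n), key=lambda k: results[k][0])
    peak = results[best][0]
    weights = [math.exp(lnl - peak) for lnl, _ in results]
    norm = sum(weights)
    mean = sum(w * z for w, z in zip(weights, grid)) / norm
    mean_sq = sum(w * z * z for w, z in zip(weights, grid)) / norm
    sigma_z = math.sqrt(max(mean_sq - mean ** 2, 0.0))
    return grid[best], results[best][1], sigma_z
```

A call such as `photo_z(1.29, 19.5, TOY_MODEL)` returns the maximum-likelihood redshift, the $\chi^2$ at that redshift, and the posterior width; a production implementation would use the full color vector and covariance matrices instead of scalars.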


3.3 Selection Cuts 

We wish to select luminous red galaxies. Consequently, we
demand that all galaxies have a luminosity $l \geq l_{\rm min}$, where
$l_{\rm min} = L_{\rm min}/L_*$ is a selection parameter that is to be determined
by the experimenter. To ensure that our final galaxy
sample is comprised of red sequence galaxies, we further
demand that our red sequence template be a good fit by
applying the selection cut

$$\chi^2_{\rm red} \leq \chi^2_{\rm max}(z_{\rm red}). \tag{15}$$

Note the $\chi^2$ cut can be redshift dependent. The
simplest possible model is $\chi^2_{\rm max}(z) = k$ for some constant $k$,
but this is rather arbitrary. What we really want is to be
able to select the "same" sample of galaxies at all redshifts.
In the absence of merging, red sequence galaxies evolve passively,
resulting in a constant comoving density sample. Of
course, galaxies do merge, so this approximation cannot be
exactly correct, but this can nevertheless be a useful approximation
for comparing galaxies across relatively narrow
redshift intervals. Thus, rather than applying a constant
$\chi^2$ cut, we construct the selection threshold $\chi^2_{\rm max}(z)$ such
that the resulting galaxy sample has a constant comoving
galaxy density. This selection also justifies our assumption
that $P(z) \propto |dV/dz|$ in the construction of the redshift likelihood.

To ensure a constant comoving space density of redMaGiC
galaxies, we parameterize $\chi^2_{\rm max}(z)$ using a spline
parameterization. The model parameters $q$ are the values of
$\chi^2_{\rm max}$ along a grid of redshifts, and the value of $\chi^2_{\rm max}(z)$
everywhere else is defined via spline interpolation. We will
come back to how the parameters $q$ are chosen momentarily.
Before we do so, however, we need to describe an additional
calibration step we take in order to improve the photometric
redshift performance of the redMaGiC algorithm.


3.4 Photo-z Afterburner 

The redMaGiC selection cuts are fully specified by the parameter
$l_{\rm min}$ and the parameters $q$ defining the function
$\chi^2_{\rm max}(z)$. If a random fraction of the selected galaxies have
spectroscopic redshifts $z_{\rm spec}$, we can use these galaxies to
remove any biases in our photo-zs. For instance, given the
redMaGiC selection specified by $l_{\rm min}$ and $q$, we could split
the spectroscopic galaxies in two, a training sample and a
validation sample. We can then use the training sample to
compute the median redshift offset $z_{\rm spec} - z_{\rm red}$ in bins of $z_{\rm red}$.
We denote this quantity as $\Delta z(z_{\rm red})$. Our new photometric
redshift estimator is

$$z_{\rm rm} = z_{\rm red} + \Delta z(z_{\rm red}), \tag{16}$$

which we can validate with the validation data set.


In practice, $\Delta z(z_{\rm red})$ is defined using spline interpolation,
with the spline parameters being determined by minimizing
the cost function

$$E_\Delta = \sum_j \left| z_{{\rm spec},j} - z_{{\rm red},j} - \Delta z(z_{{\rm red},j}) \right|, \tag{17}$$

where the sum is over all spectroscopic redMaGiC galaxies.
We add the absolute values rather than the squares to reduce
the impact of possible catastrophic outliers.
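
The binned version of this correction can be sketched as follows. The example uses per-bin medians rather than the spline fit described above (within a single bin, the median is the constant offset that minimizes a sum of absolute residuals); the function names and bin edges are assumptions of this illustration.

```python
from statistics import median

def median_offsets(z_red, z_ref, edges):
    """Median of (z_ref - z_red) in bins of z_red; None for empty bins."""
    bins = [[] for _ in range(len(edges) - 1)]
    for zr, zs in zip(z_red, z_ref):
        for k in range(len(bins)):
            if edges[k] <= zr < edges[k + 1]:
                bins[k].append(zs - zr)
                break
    return [median(b) if b else None for b in bins]

def apply_afterburner(z_red, edges, offsets):
    """z_rm = z_red + Delta_z(z_red), using the offset of the enclosing bin."""
    for k in range(len(offsets)):
        if edges[k] <= z_red < edges[k + 1] and offsets[k] is not None:
            return z_red + offsets[k]
    return z_red  # outside the calibrated range: leave unchanged
```

Here `z_ref` plays the role of the reference redshift of each calibration galaxy; the offsets would be measured on a training subsample and checked on a separate validation subsample.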

Of course, in general one is hardly assured spectroscopic
redshifts for a large representative sample of redMaGiC
galaxies. We overcome this problem by relying instead
on redMaGiC galaxies that are members of redMaPPer
clusters (membership probability $p_{\rm mem} \geq 0.9$), using the
redMaPPer photometric cluster redshift $z_\lambda$ as the "spectroscopic"
redshift of the calibration galaxies. Roughly, the redshift
$z_\lambda$ is obtained by simultaneously fitting the ensemble
of cluster galaxies with a single photometric redshift. It has
already been shown that redMaPPer redshifts are unbiased
and much more accurate than the photometric redshifts of
individual galaxies. We emphasize that by making use of
photometric cluster members our calibration sample is not
restricted to the brightest redMaGiC galaxies, as would be
the case for a typical spectroscopic calibration sample.

In addition to modifying the photometric redshift estimate
$z_{\rm rm}$, we also modify the photometric redshift errors.
Imagine again binning the galaxy calibration sample by $z_{\rm rm}$.
For each bin, we could compute the Median Absolute Deviation
${\rm MAD} = {\rm median}\{|z_{\rm rm} - z_\lambda|\}$. For a Gaussian distribution,
$\langle {\rm MAD} \rangle = \sigma_z/1.4826$, where $\sigma_z$ is the standard
deviation. Thus, the quantity $1.4826\,|z_{\rm rm} - z_\lambda|$ is an estimator
for $\sigma_z$. Let then $\sigma_0$ be our original photometric redshift
error estimate as per Section 3.2. We assume that the corrected
photometric redshift error $\sigma_1$ for each galaxy is given
by $\sigma_1 = r(z_{\rm rm})\,\sigma_0$, where $r(z_{\rm rm}) = \sigma_z/\sigma_0$. Rather than doing
this in bins, we parameterize $r(z)$ via spline interpolation,
with the best fit parameters being those which minimize the
cost function

$$E_\sigma = \sum_j \left| 1.4826\,|z_{{\rm rm},j} - z_{\lambda,j}| - r(z_{{\rm rm},j})\,\sigma_{0,j} \right|. \tag{18}$$

The sum is over all calibration galaxies, and we again use
absolute values to reduce the impact of possible catastrophic
outliers. We note that the afterburner perturbations to the
photometric redshifts are small, but do improve photometric
redshift performance.
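
For a single redshift bin, the error rescaling can be sketched with medians. This is a simplification under stated assumptions: the function name is hypothetical, one constant factor is estimated per bin instead of a spline $r(z)$, and the typical reported error is taken to be the median of $\sigma_0$.

```python
from statistics import median

def error_scale(z_rm, z_ref, sigma_0):
    """Estimate r = sigma_z / sigma_0 for the galaxies in one redshift bin.

    sigma_z is estimated as 1.4826 times the median of |z_rm - z_ref|, and
    the typical reported error as the median of sigma_0.
    """
    mad = median(abs(a - b) for a, b in zip(z_rm, z_ref))
    return 1.4826 * mad / median(sigma_0)
```

Multiplying each galaxy's original error by the factor of its bin yields the corrected error; a factor above unity indicates that the original errors were underestimated.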

With the new estimator $z_{\rm rm}$ in hand and its improved
error estimate, we can recompute the luminosity $l$ and $\chi^2$ of
every galaxy in the survey, and reapply our selection cuts to
arrive at an improved redMaGiC sample.


3.5 $\chi^2_{\rm max}$ Calibration

We have seen how to select redMaGiC galaxies given the
selection parameters $q$, but we have yet to specify how the
parameters $q$ are selected. To do so, we first define a series
of redshift bins $z_j$ going from the minimum redshift of interest
$z_{\rm min}$ to the maximum redshift $z_{\rm max}$. Given a set of
selection parameters $q$, we construct the redMaGiC sample
by applying the luminosity and $\chi^2$ cuts as above. Next, we
compute the photo-z afterburner parameters for the sample
derived from the parameters $q$, which allows us to compute
$z_{\rm rm}$ for every galaxy. We then measure the comoving space
density $n_j(q)$ in each redshift bin $j$. Since we want to enforce
a constant comoving density $\bar{n}$, we define the cost function
$E(q)$ via

$$E(q) = \sum_j \frac{(n_j(q) - \bar{n})^2}{\bar{n}\,V_j^{-1}}, \tag{19}$$

where $V_j$ is the comoving volume of redshift bin $j$, and $n_j$ is the
empirical redMaGiC galaxy density in redshift bin $j$. The denominator
is the expected Poisson variance for a galaxy density $\bar{n}$. The spline
parameters $q$ are obtained by minimizing the cost function
$E(q)$ using the downhill-simplex method of Nelder & Mead
(1965). We always use redshift bins that are significantly
narrower than the spacing between spline nodes, and we
take care to ensure that the number of galaxies $n_j V_j \gg 1$
in every redshift bin. We emphasize that the photo-z afterburner
parameters are re-estimated at every iteration in the
minimization, to ensure that we have a consistent sample
selection given the updated galaxy redshifts. Finally, with
the spline parameters determined, we apply the corresponding
$\chi^2_{\rm red} \leq \chi^2_{\rm max}(z_{\rm rm})$ cut to arrive at the final redMaGiC
galaxy sample.
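
The cost function of equation (19) is straightforward to evaluate once the galaxy counts and comoving volumes per redshift bin are known. The sketch below also includes a per-bin shortcut for choosing a threshold, which is a simplification introduced for this example and not the spline plus downhill-simplex fit described above; both function names are hypothetical.

```python
def density_cost(counts, volumes, n_bar):
    """Equation (19): sum_j (n_j - n_bar)^2 / (n_bar / V_j)."""
    cost = 0.0
    for count, volume in zip(counts, volumes):
        n_j = count / volume                       # empirical density in bin j
        cost += (n_j - n_bar) ** 2 / (n_bar / volume)
    return cost

def threshold_for_density(chi2_values, volume, n_bar):
    """chi^2 cut keeping the n_bar * volume best-fitting galaxies in one bin."""
    target = int(round(n_bar * volume))
    ranked = sorted(chi2_values)
    if target <= 0 or not ranked:
        return 0.0
    return ranked[min(target, len(ranked)) - 1]
```

Written in terms of counts, each term of the cost is $(N_j - \bar{n} V_j)^2 / (\bar{n} V_j)$, i.e. the squared deviation from the expected count in units of its Poisson variance.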


3.6 Selection Summary 


Despite the computational complexity of the above selection,
it is worth emphasizing that our selection algorithm
contains only two free parameters, both of which have clear
physical interpretations: the luminosity cut $l_{\rm min}$, and the
desired space density $\bar{n}$ of the resulting galaxy sample. Importantly,
the "color cuts" that select red galaxies are self-trained
from the data. By comparison, the SDSS CMASS
galaxy selection involves 12 parameters hand-picked a priori
to produce an approximately stellar-mass limited sample
at $z \gtrsim 0.45$ (Dawson et al. 2013).

It is also important to note that our selection makes
it very easy to test different selection thresholds, allowing
one to optimize galaxy selection for scientific purposes.
Some patterns emerge: $l_{\rm min}$ must always be low enough
for the corresponding $\chi^2_{\rm max}$ threshold to be reasonable (i.e.
$\chi^2/{\rm dof} < 2$). If $l_{\rm min}$ is too large, redMaGiC will start pulling
in galaxies with large $\chi^2$ values in order to attempt to reach
the desired space density, which will result in a large number
of photo-z outliers. We find that when this happens it
becomes difficult to construct a truly flat $n(z)$ sample, so
checking the comoving space density of the redMaGiC catalog
is a quick and easy way to test whether the redMaGiC
algorithm is performing as desired.

We illustrate the performance of our algorithm for a set
of fiducial cuts $l_{\rm min} = 0.5$ and $\bar{n} = 10^{-3}\,h^3\,{\rm Mpc}^{-3}$.
In the accompanying figure, the left panel shows the $\chi^2_{\rm max}$ threshold for
each of our three redMaGiC samples, while the right panel
shows the resulting galaxy comoving densities as a function
of redshift. We see that in all cases the observed space density
is close to flat, and that the $\chi^2$ thresholds are low, as
desired.


4 PHOTO-Z PERFORMANCE 

We consider two sets of redMaGiC galaxies. The first is our
fiducial sample, selected to be galaxies brighter than $0.5 L_*$
and with a space density $\bar{n} = 10^{-3}\,h^3\,{\rm Mpc}^{-3}$. Unless otherwise
stated, all of the results noted below correspond to
these fiducial selection parameters. The second sample is a
high luminosity, low space density redMaGiC sample, comprised
of galaxies brighter than $L_*$, with a space density of
$2 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3}$. This high luminosity sample will be
useful for comparing against other commonly used galaxy
samples, particularly CMASS.

Figure 3 shows the photometric redshift performance for our fiducial
selection in the SV, DR8, and S82 data sets. The spectroscopic data used
to characterize the photometric redshift performance were described
earlier. The photometric redshift bias $\overline{\Delta z}$ is defined
as the median of the offsets
$\Delta z = z_{\rm spec} - z_{\rm rm}$. The scatter is defined as
$1.4826 \times {\rm MAD}$, where MAD is the median absolute deviation,
i.e. the median of $|\Delta z - \overline{\Delta z}|$. For Gaussianly
distributed data, $\overline{\Delta z}$ and $1.4826 \times {\rm MAD}$
are unbiased estimators of the mean and standard deviation of these
offsets. In using median statistics, our results are robust to a small
fraction of gross outliers.
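As a concrete illustration of these definitions, the following minimal
sketch computes the median bias and the MAD-based scatter for a set of
galaxies. The function name and the plain-list inputs are assumptions
made for the example; in practice the statistics are evaluated
separately in each redshift bin.

```python
import statistics


def robust_bias_and_scatter(z_spec, z_photo):
    """Median bias and 1.4826 x MAD scatter of dz = z_spec - z_photo."""
    dz = [zs - zp for zs, zp in zip(z_spec, z_photo)]
    bias = statistics.median(dz)
    # MAD: median of the absolute deviations from the median offset.
    mad = statistics.median(abs(d - bias) for d in dz)
    return bias, 1.4826 * mad
```

Because both quantities are medians, a single catastrophic photo-z in
the input barely changes the result.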

The most obvious features in the left-hand plots of Figure 3 are the
three clumps of outlier points. These are obvious for both DR8 and S82
data, but not apparent in the DES SV data. We are confident this
reflects the paucity of spectra in the DES data rather than a sudden and
unexpected improvement in the redMaGiC performance. We discuss each of
these clumps in Section 8.

Turning to the bias and scatter plots in the right column of Figure 3,
we see that for all data sets there is excellent agreement between the
observed redshift scatter (red solid line) and the predicted photo-z
uncertainty (dashed blue line). The latter is simply the median photo-z
error in each bin. Note that the predicted redshift errors in the SDSS
S82 and DES SV data sets are clearly double-humped. This is expected:
photometric redshift uncertainties increase whenever the 4000 Å break
feature in the spectra of these galaxies falls in between filters. At
$z \approx 0.35$ there is a peak associated with the g to r filter
transition, and at $z \approx 0.65$ we see a second peak associated with
the r to i filter transition.

Comparing the three data sets, we see DR8 and S82 have nearly identical
photometric redshift errors at low redshifts, which demonstrates that
the redshift errors are set by the intrinsic width of the red sequence.
By contrast, at $z > 0.4$ the photometric errors in DR8 are clearly
important, and so its photo-z errors are larger than those in S82.
Notably, DES has larger photometric redshift scatter than the SDSS data
sets. There are several contributors to this result. First, the
spectroscopic training set for redMaPPer is still quite sparse, and so
the redMaPPer calibration is expected to be noisier than in the SDSS
data sets. Second, DES SV MAG_AUTO colors are expected to be
intrinsically noisier than SDSS MODEL_MAG colors, leading to larger
uncertainties.

Figure 2. Left: Selection cut $\chi^2_{\rm max}$ as a function of
redshift defining each of the redMaGiC galaxy samples, as labelled. The
symbols mark the spline nodes defining the function
$\chi^2_{\rm max}(z)$, while the lines show the corresponding spline
interpolation at every point. Right: redMaGiC comoving galaxy density as
a function of redshift for each of the three data sets employed in this
work, as labelled. The target comoving space density was
$10^{-3}\,h^3\,{\rm Mpc}^{-3}$ (horizontal dotted line).

Turning to the bias, we see that the DR8 redMaGiC sample appears to have
a negative bias at $z \approx 0.3$. By contrast, the S82 sample exhibits
a slight positive bias at the same photometric redshift. The situation
reverses at $z \approx 0.25$, with S82 galaxies exhibiting bias while
DR8 galaxies do not. We believe these biases are driven by
non-representative spectroscopic sampling of redMaGiC galaxies.
Specifically, our photometric redshift tests rely on the subset of
redMaGiC galaxies that have spectra. If that subset is biased relative
to the full population, we would in fact expect to see a photometric
redshift bias.

Figure 4 shows redMaGiC galaxy density contours in the $g - r$ vs.
$r - i$ plane for several photometric redshift bins. The filled red and
orange contours show the regions containing 68% and 95% of all redMaGiC
galaxies with spectroscopic redshifts. The solid ellipses show the
corresponding regions for all redMaGiC galaxies with a magnitude
threshold set by the spectroscopic redMaGiC sub-sample. Offsets between
the red-orange contours and the solid line contours imply a
non-representative spectroscopic sampling of the redMaGiC galaxy
population.

It is clear from Figure 4 that DR8 spectroscopic sampling is biased at
$z > 0.3$, with the reddest galaxies starting to be somewhat
over-sampled. There is a similar trend of over-sampling the reddest
redMaGiC galaxies in S82 starting at $z \approx 0.23$. These differences
appear to be correlated with the presence of “large” photo-z biases in
Figure 3.

The photo-z bias at $z \approx 0.6$ in the S82 data is rather unusual.
It is large and negative ($\approx -0.005$) when using SDSS
spectroscopy, but large and positive ($\approx 0.009$) when using
VIPERS. The difference between the two spectroscopic data sets further
highlights the impact that spectroscopic sampling can have on our
conclusions.


The origin of the redshift biases in the DES SV redMaGiC sample is much
more difficult to ascertain. First, the spectroscopic training set for
redMaPPer is very sparse, and is most certainly not representative of
the sample as a whole. For instance, there is a dearth of spectroscopic
galaxies at $z \approx 0.4$. A histogram of the number of redMaGiC
galaxies as a function of redshift is shown in Figure 5, along with a
contour plot showing how these galaxies populate the redshift-magnitude
space. Second, most of the redshifts available to us come from training
sets in the SN fields, adding up to $\approx 30\,{\rm deg}^2$. The small
area results in only a handful of spectroscopic clusters for red
sequence calibration. Third, our reliance on MAG_AUTO colors in the DES
is expected to adversely affect photo-z performance. Fortunately, all of
these difficulties will be considerably ameliorated, if not entirely
removed, as the DES images larger areas and updates the data reduction
pipelines.

A summary of the statistical performance of redMaGiC is presented in
Table 1.

5 COMPARISON TO EXISTING PHOTO-Z ALGORITHMS

5.1 DR8 Comparisons 

As noted in the introduction, redMaGiC seeks both to select galaxies
with robust photometric redshifts, and to develop a photometric redshift
estimator that can be used on these galaxies with minimal spectroscopic
training data. For the latter to be useful, however, the performance of
our algorithm must be comparable to that of existing algorithms. We now
test how the redMaGiC photo-zs compare with state-of-the-art photometric
redshift codes run on redMaGiC galaxies. We start with the SDSS data
set. To make the comparison as fair as possible we rely on the high
luminosity ($L \geq L_*$), low space density redMaGiC sample, as the
typical magnitudes of these galaxies are closer to the magnitudes of the
galaxies with spectroscopic redshifts. Note this high luminosity
redMaGiC sample goes up to a maximum redshift $z = 0.55$ rather than the
$z = 0.45$ redshift we could achieve with the low luminosity sample.
However, we restrict our attention to $z \in [0.1, 0.5]$ rather than
$z \in [0.1, 0.55]$. This is because for $z \gtrsim 0.5$, the
spectroscopic sampling of redMaGiC galaxies becomes increasingly biased,
as illustrated in Figure 6.

Figure 3. Left: Spectroscopic redshift vs photometric redshift for the
fiducial redMaGiC galaxy sample in each of the various data sets
considered in this work. Red points are $5\sigma$ outliers, while the
red line corresponds to $z_{\rm spec} = z_{\rm photo}$. Right:
Photometric redshift performance statistics. Red points with error bars
are the photometric redshift bias, defined as the median value of
$z_{\rm spec} - z_{\rm photo}$. All statistics for the SDSS data sets
are computed using SDSS spectroscopy, except for the purple VIPERS point
for S82. The red curve is the observed scatter of
$(z_{\rm photo} - z_{\rm spec})/(1 + z_{\rm spec})$, while the dashed
blue curve is the predicted scatter based on the available photometry.
The horizontal error bar for the S82 plot shows the width of the
redshift bin used in the VIPERS measurement.

Figure 4. 68% and 95% galaxy density contours in $g - r$ vs. $r - i$
space for DR8 and S82 redMaGiC galaxies for a variety of redshift bins,
as labelled. Red/orange contours correspond to redMaGiC galaxies with
spectroscopic redshifts, while the solid black curves show the contours
for the full redMaGiC sample. A mismatch between the colored and black
ellipses implies biased spectroscopic sampling of redMaGiC galaxies.

Table 1. Photometric redshift performance of redMaGiC galaxies. All
quantities are first computed in redshift bins, and then the median of
the binned values is reported. Bias and |Bias| are the median values for
$(z_{\rm spec} - z_{\rm photo})$ and $|z_{\rm spec} - z_{\rm photo}|$
respectively. The scatter is $1.4826 \times {\rm MAD}$, where MAD is the
median absolute deviation of
$|z_{\rm spec} - z_{\rm photo}|/(1 + z_{\rm spec})$. The predicted
scatter is the median value of $\sigma_z/(1 + z_{\rm photo})$, where
$\sigma_z$ is the reported photo-z error.

Space Density                           Redshift Range        Data Set   Bias     |Bias|   Scatter  Predicted Scatter  $5\sigma$ Outlier Fraction
$10^{-3}\,h^3\,{\rm Mpc}^{-3}$          $z \in [0.2, 0.8]$    DES SV     0.51%    0.51%    1.69%    1.78%              1.4%
                                        $z \in [0.1, 0.65]$   SDSS S82   0.17%    0.39%    1.10%    0.97%              2.2%
                                        $z \in [0.1, 0.45]$   SDSS DR8   -0.04%   0.20%    1.43%    1.40%              0.8%
$2 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3}$ $z \in [0.2, 0.8]$    DES SV     0.19%    0.37%    1.50%    1.59%              0.9%
                                        $z \in [0.1, 0.65]$   SDSS S82   0.14%    0.22%    1.04%    1.03%              1.5%
                                        $z \in [0.1, 0.45]$   SDSS DR8   -0.22%   0.22%    1.40%    1.46%              1.9%


We consider three photo-z algorithms. The first set of photo-zs is that
included with SDSS DR7 (Abazajian et al. 2009), which we shall refer to
simply as the SDSS photo-zs. These were obtained through a hybrid method
that combines the spectral templates of Budavári et al. (2000) with the
machine learning method of Csabai et al. (2007). A second set of
photo-zs we compare against are those from Hoyle et al. (2015), which we
will refer to as the RDF photo-zs. This algorithm uses a combination of
decision trees and feature importance to derive photometric redshift
estimates. RDF photo-zs use 85 galaxy features with a 60%/40% split for
training and validation. Finally, we utilize the publicly available code
ANNZ (Collister & Lahav 2004) to estimate the redshifts of redMaGiC
galaxies. This choice is motivated by the results of Abdalla et al.
(2011), who performed a detailed comparison of six photometric redshift
algorithms, and found ANNZ performed best in luminous red galaxy
samples. We train ANNZ with 2/3 of the full spectroscopic training
sample, and test on the remaining 1/3. The neural net had 5 input nodes
(4 MODEL_MAG galaxy colors, and a total $m_i$, for which we use
CMODEL_MAG). We utilized two hidden layers of 10 nodes each, as per the
standard architecture.
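To make the training setup concrete, the sketch below reproduces the two
bookkeeping steps described above: a random 2/3 to 1/3 split of the
spectroscopic sample, and the layer layout of a network with 5 inputs
and two hidden layers of 10 nodes. It does not call ANNZ itself. The
single output node (one redshift estimate), the random seed, and all
function names are assumptions made for this illustration.

```python
import random


def split_two_thirds(indices, seed=0):
    """Randomly split indices into a 2/3 training and 1/3 test set."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_train = (2 * len(shuffled)) // 3
    return shuffled[:n_train], shuffled[n_train:]


def mlp_parameter_count(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# 5 inputs (4 colors + 1 magnitude), two hidden layers of 10 nodes,
# and an assumed single output node for the redshift estimate.
ASSUMED_LAYERS = [5, 10, 10, 1]
```

With this layout `mlp_parameter_count(ASSUMED_LAYERS)` gives the number
of free parameters the training sample has to constrain, which is small
compared with typical spectroscopic training sets.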


A comparison of the redMaGiC photo-zs to the SDSS photo-zs is shown in
Figure 7. We find the SDSS photo-zs are slightly less biased than the
redMaGiC photo-zs, but have nearly identical scatters. The SDSS photo-zs
also do a better job of error characterization, though the difference is
not large. The picture is much the same for ANNZ, except that ANNZ
grossly underestimates the photometric redshift scatter (not shown). RDF
redshifts are clearly superior to the SDSS, ANNZ, and redMaGiC photo-zs,
though the improvement remains modest: the scatter decreases from 1.48%
in redMaGiC to 1.28% in RDF (not shown). The agreement between the ANNZ,
SDSS, and redMaGiC redshifts strongly suggests that the improvement seen
with RDF is primarily due to the large number of features used (85
observables), rather than more optimal use of the limited information
used in redMaGiC (5 bands).

Figure 5. Left: dN/dz histogram for the fiducial redMaGiC galaxy sample.
The dotted line is the expected distribution for a constant comoving
density sample. The red histogram is the redMaGiC data binned by our
photometric redshift estimate. The blue histogram shows the number
counts for the redMaGiC sample with spectroscopic redshifts, boosted by
a factor of 10 for clarity. Right: Contours containing 68%, 95%, and 99%
of redMaGiC galaxies (colored contours) or redMaGiC galaxies with
spectroscopic redshifts (solid contours). The dearth of galaxies at
$z \approx 0.4$ and the relative excess of bright galaxies in the
spectroscopic sample is apparent.

Figure 6. Distribution of redMaGiC galaxies in the photometric redshift
bin $z_{\rm photo} \in [0.54, 0.55]$. Orange/red contours show the color
distribution of redMaGiC galaxies with spectroscopic redshifts, while
the solid ellipses show the distribution of all redMaGiC galaxies. The
large offsets between the two sets of ellipses are due to biased
spectroscopic sampling of the redMaGiC galaxies.

A quantitative summary of these results is presented in Table 2. Also
reported there is the fraction of galaxies where
$|z_{\rm photo} - z_{\rm spec}|/(1 + z_{\rm spec}) \geq 0.07$,
corresponding roughly to $5\sigma$ for redMaGiC galaxies. This number
characterizes how large the tails of the photo-z errors are. All methods
we consider here have comparable tails.

We caution, however, that these tests represent a best-case scenario for
training set methods. Specifically, machine learning methods do not
extrapolate outside their training sets very well. Consider red galaxies
as a specific example. Because the red sequence is tilted, a faint red
sequence galaxy will appear bluer than a bright red sequence galaxy.
Consequently, red sequence galaxies fainter than the training data set
of a machine learning algorithm will have
$z_{\rm photo} \leq z_{\rm spec}$.

Figure 7. Left: Comparison of the photometric redshift performance of
redMaGiC (red) and SDSS photo-zs for redMaGiC galaxies (blue). This plot
uses SDSS spectroscopic redshifts to compute the redshift bias and
scatter of the redMaGiC photo-zs, and is therefore limited to the
brightest redMaGiC galaxies. Points with error bars show the median
redshift bias for each of the two samples. Solid lines show the observed
photo-z scatter, while dashed lines show the predicted scatter. Right:
As for the left panel, only now we test the photo-z performance of the
sub-sample of redMaGiC galaxies that are members of redMaPPer clusters.
For these galaxies, we assign the photometric redshift of the host
redMaPPer cluster as the “spectroscopic” redshift of the redMaGiC galaxy
for the purposes of computing photometric redshift biases and scatter.
By doing so, we can test the accuracy of the photometric redshifts of
faint redMaGiC galaxies with no spectroscopic redshift.


We can indirectly verify this expectation by looking at members of
galaxy clusters. Specifically, we select all redMaPPer high probability
(membership probability $\geq 90\%$) cluster members, and assign to all
such members a “spectroscopic” redshift equal to the photometric cluster
redshift. We then compare the redMaGiC and SDSS photo-zs of these
galaxies to their assigned cluster redshifts. The redshift bias
$z_{\rm cluster} - z_{\rm photo}$ and corresponding scatter are shown in
the right panel of Figure 7. We see that our expectation that
$z_{\rm photo} \leq z_{\rm spec}$ is borne out by the data, and that the
bias can be large, $\approx 0.02$. At very high redshifts, the
luminosity threshold in redMaPPer approaches the spectroscopic magnitude
limit, and so the bias starts to decrease with redshift.
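The cluster-member test above amounts to a simple substitution of the
cluster redshift for the missing spectroscopic redshift. A minimal
sketch follows, assuming each member is represented by a dictionary with
hypothetical keys `p_mem` (membership probability), `z_cluster`
(photometric redshift of the host cluster) and `z_photo` (the galaxy
photo-z under test). These names are illustrative and are not taken from
the redMaPPer catalog format.

```python
import statistics


def cluster_member_bias(members, min_probability=0.9):
    """Median of z_cluster - z_photo over high-probability members.

    The host cluster redshift stands in for a spectroscopic redshift.
    Returns None when no member passes the probability cut.
    """
    offsets = [m["z_cluster"] - m["z_photo"]
               for m in members
               if m["p_mem"] >= min_probability]
    return statistics.median(offsets) if offsets else None
```

A positive return value corresponds to photo-zs that are biased low
relative to the cluster redshift, which is the behaviour expected for
faint red sequence galaxies outside the training set.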

The main takeaways from these tests are that redMaGiC photo-zs perform
as well as the best machine learning methods run with the same
photometric input. However, machine learning methods can improve on
redMaGiC by exploiting additional data. Critically, however, machine
learning methods do not extrapolate well, and appear to be subject to
large redshift biases for galaxies that are not well represented in the
training data sets (however, see Hoyle et al. 2015). Because of how the
redMaGiC algorithm is structured, this is not a problem for redMaGiC
photo-zs.


5.2 DES Comparisons 


We compare redMaGiC photo-zs to two algorithms currently in use within
the DES collaboration (Sánchez et al. 2014), specifically SkyNet and BPZ
photo-zs. SkyNet is a machine learning method that relies on neural
networks to “classify” galaxies into redshift bins (Graff et al. 2014;
Bonnett 2015), while BPZ is a popular template based code (Benítez
2000). We use BPZ with its default configuration (8 templates,
INTERP=2, and we do not allow for zero point offsets). While there are
other machine learning methods available in DES, they all have
comparable performance, so we have arbitrarily chosen to focus on SkyNet
to simplify our analysis.


Figure 8 compares the performance of SkyNet on the redMaGiC galaxy
sample to that of the redMaGiC photo-zs. The two algorithms perform
equally well in terms of photo-z biases and scatter. However, SkyNet
grossly overestimates the photometric redshift uncertainty, with the
SkyNet predicted uncertainties being a factor of 3.5 times larger than
the observed errors. This is not unexpected: SkyNet and the other
machine learning codes used in the DES SV data have their photometric
redshifts smoothed and broadened (for details, see Appendix C in Bonnett
et al. in preparation), which improves photo-z performance for lensing
sources, but, as evidenced here, has a deleterious effect on the photo-z
error estimates for redMaGiC galaxies. SkyNet and redMaGiC also exhibit
similar tails.


BPZ performs very poorly at low redshifts, exhibiting a redshift bias of
$\approx 0.1$. The bias decreases to $\approx 0.02$ at higher redshifts,
but remains well above the SkyNet/redMaGiC biases. The redshift scatter
for BPZ is comparable to that of SkyNet/redMaGiC, but the uncertainties
are overestimated by a factor of $\approx 6$. Nearly 12% of all galaxies
have $|z_{\rm spec} - z_{\rm photo}|/(1 + z_{\rm spec}) \geq 0.08$ for
BPZ, compared with $\approx 1.4\%$ for redMaGiC/SkyNet.

Table 2. As Table 1, but comparing the redshift performance of different
photo-z algorithms on redMaGiC galaxies. We only consider the redMaGiC
sample with space density $2 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3}$. The
redshift range of consideration is $z \in [0.1, 0.5]$ for DR8, and
$[0.2, 0.8]$ for DES. “Bad fraction” is the fraction of galaxies where
$|z_{\rm photo} - z_{\rm spec}|/(1 + z_{\rm spec})$ is $\geq 0.07$ (for
SDSS) or $\geq 0.08$ (for DES), corresponding roughly to $5\sigma$ for
redMaGiC photo-zs. DR8 Spec AB data sets correspond to redMaGiC with a
spectroscopic afterburner (see Section 7).

Data Set            Bias     |Bias|   Scatter  Predicted Scatter  Bad Fraction
SV redMaGiC         0.35%    0.35%    1.82%    1.80%              1.4%
SV SkyNet           -0.36%   0.59%    1.58%    5.31%              1.1%
SV BPZ              1.48%    2.95%    1.59%    9.821%             11.6%
DR8 redMaGiC        -0.23%   0.23%    1.48%    1.39%              1.4%
DR8 SDSS photo-z    -0.00%   0.02%    1.37%    1.38%              1.3%
DR8 RDF photo-z     0.01%    0.03%    1.25%    1.28%              1.3%
DR8 ANNZ photo-z    -0.09%   0.13%    1.33%    1.29%              1.5%
DR8 Spec AB         0.01%    0.03%    1.49%    1.47%              1.1%

Figure 8. As Figure 7, only now we compare SV SkyNet photo-zs (blue) to
SV redMaGiC photo-zs (red). The predicted SkyNet scatter is not shown,
as the SkyNet predicted errors are a factor of 3.5 larger than the
observed scatter.


Our results confirm the basic picture we obtained from the DR8
comparisons: redMaGiC performs as well as the best performing machine
learning methods, despite not requiring representative spectroscopic
training samples. BPZ performance is especially poor. Importantly,
redMaGiC continues to have extremely well characterized scatter, whereas
SkyNet/BPZ do not.


6 WHY SELECTION MATTERS 

The primary motivation of the redMaGiC algorithm is not to improve upon
existing photometric redshift algorithms, but rather to select a galaxy
sample with robust photo-zs. The results in the previous section clearly
demonstrate that redMaGiC galaxies do, in fact, have photometric
redshifts that are both precise and accurate. In this section we
investigate whether this feature is unique to the redMaGiC sample. In
particular, we look at the current work-horse for large-scale structure
measurements in the SDSS, the CMASS galaxy sample. CMASS galaxies were
specifically selected to be roughly stellar mass limited at
$z \gtrsim 0.45$. Here, we test whether the redMaGiC selection can lead
to improved photometric redshift performance relative to CMASS. Note
that any gains we make are not of critical importance for spectroscopic
experiments, as such experiments are not sensitive to large photometric
redshift scatter and/or catastrophic photo-z failures.


A fair comparison of CMASS to redMaGiC galaxies is difficult. In
particular, we’d like to compare samples that have comparable space
densities (which control the errors in clustering signal) and
luminosities (which set the photometric uncertainty). For comparison
purposes, Table 3 quotes typical densities for a couple of standard SDSS
galaxy samples, namely LRG (Eisenstein et al. 2001), and LOWZ and CMASS
(Dawson et al. 2013). Also shown is the minimum luminosity of galaxies
in that sample at a typical redshift. Densities for the standard SDSS
samples are based on Figure 1 of Tojeiro et al. (2014). We see that even
our bright redMaGiC sample has a comparable density to CMASS, but a
lower luminosity threshold, reflecting the more stringent color cuts
applied in redMaGiC. We will compare CMASS against this sample. Note
CMASS galaxies are $\approx 0.3$ magnitudes brighter than the redMaGiC
galaxies we compare against. The added photometric noise of the fainter
redMaGiC galaxies should degrade their photometric redshift performance
relative to CMASS. That is, the match-up is purposely stacked against
redMaGiC for this comparison.

Figure 9 shows how galaxies fall in the $z_{\rm spec}$-$z_{\rm photo}$
plane for both CMASS (left panel) and redMaGiC (right panel). For the
CMASS data set we rely on SDSS photo-zs (Csabai et al. 2007), while we
use redMaGiC photo-zs for redMaGiC. Note that redMaGiC and SDSS photo-zs
had nearly identical performance on redMaGiC galaxies, so the
performance in the right-hand plot would be much the same if we replaced
redMaGiC photo-zs with SDSS photo-zs.


The benefit of the redMaGiC selection is immediately apparent: despite
probing fainter galaxies, the redMaGiC galaxies have clearly better
behaved photometric redshifts than those of CMASS. The photo-z scatter
is 1.5% for redMaGiC, and 2.1% for CMASS. In addition, the fraction of
galaxies with large redshift errors ($|\Delta z|/(1 + z) \geq 0.07$) is
much larger for CMASS (6.4%) than for redMaGiC (1.4%).

Figure 9. Left: Spectroscopic vs. photometric redshifts for CMASS
galaxies using SDSS photo-zs. Colored regions contain 68%, 95%, and 99%
of the points. The remaining 1% of galaxies are shown as points. The
blue line is the y = x line. Right: As left panel, but for redMaGiC
galaxies.


We note that the photo-z scatter for CMASS galaxies quoted here is
significantly lower than that reported in Ross et al. (2011). This is
partly because we define scatter as $\sigma_z/(1 + z)$, while Ross et
al. (2011) quote $\sigma_z$, and partly because we estimate $\sigma_z$
using median statistics, while Ross et al. (2011) use
$\sigma_z^2 = \langle (z_{\rm spec} - z_{\rm photo})^2 \rangle$, which
is more sensitive to gross outliers than the MAD-based estimate.
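The difference between the two scatter definitions is easy to see
numerically. The sketch below, written for illustration only with
hypothetical function names, evaluates both estimators on a list of
redshift offsets; adding a single gross outlier inflates the
root-mean-square estimate far more than the MAD-based one.

```python
import math
import statistics


def mad_scatter(dz):
    """1.4826 x median absolute deviation of the offsets."""
    med = statistics.median(dz)
    return 1.4826 * statistics.median(abs(d - med) for d in dz)


def rms_scatter(dz):
    """Square root of the mean squared offset."""
    return math.sqrt(sum(d * d for d in dz) / len(dz))


# Five well-behaved offsets, then the same set plus one catastrophic
# failure (values chosen purely for illustration).
clean = [-0.02, -0.01, 0.0, 0.01, 0.02]
with_outlier = clean + [0.5]
```

Comparing `rms_scatter(with_outlier)` with `mad_scatter(with_outlier)`
shows the order-of-magnitude gap that a small tail of outliers can open
between the two definitions.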

It is also clear from Figure 9 that CMASS galaxies with
$z_{\rm spec} \lesssim 0.3$ are particularly ill-behaved. This is not
particularly problematic for experiments like BOSS, where the
spectroscopic follow-up of the targets ensures that these contaminants
don’t percolate into clustering measurements at $z \approx 0.5$. By
contrast, a photometric survey would end up including those galaxies in
its clustering measurements, leading to systematic errors in the
clustering signal. This further highlights the importance of redMaGiC
selection for photometric large-scale structure studies.

We can also compare the performance of the RDF photometric redshifts in
the CMASS sample to redMaGiC. Relative to the SDSS photo-zs, RDF shows
clear improvement: the scatter is reduced to 1.9%, and the fraction of
galaxies with large errors goes down to 2.2%. This is not surprising:
RDF redshifts were trained on CMASS galaxies, whereas the SDSS photo-zs
were not. This highlights the importance of training for machine
learning methods, a weakness not shared by redMaGiC. Just as
importantly, even RDF redshifts for CMASS galaxies are worse than
redMaGiC redshifts for redMaGiC.

In short, we find redMaGiC is extremely successful at identifying
galaxies with robust photometric redshift estimates. Of course, CMASS
was designed to be used for a spectroscopic survey, so the differences
highlighted here are much less important in that case. For purely
photometric surveys, however, our selection algorithm is clearly
superior.


Table 3. Typical space density and luminosity cuts for a variety of
different SDSS galaxy samples.

Sample            Space Density                   Minimum Luminosity
                  ($h^3\,{\rm Mpc}^{-3}$)         ($L_*$)
LRG               $1 \times 10^{-4}$              2.1 (at $z = 0.35$)
LOWZ              $3 \times 10^{-4}$              1.6 (at $z = 0.35$)
CMASS             $2 \times 10^{-4}$              1.5 (at $z = 0.5$)
redMaGiC Bright   $2 \times 10^{-4}$              1.0
redMaGiC Faint    $1 \times 10^{-3}$              0.5


7 SPECTROSCOPIC TRAINING OF redMaGiC 

We consider whether photo-zs from redMaGiC can be significantly improved
with further spectroscopic training data. Specifically, in the redMaGiC
algorithm, we use a photo-z “afterburner” that relies on photometric
cluster galaxies to help fine-tune our photo-zs. We now consider what
happens if we apply a further “afterburner” using spectroscopic redshift
information for the redMaGiC sample. As a proof-of-concept, we use the
redMaGiC galaxies that are in the SDSS DR10 spectroscopic catalog, and
split the sample in half for training and validation. All results shown
are for the validation sample only.

For our spectroscopic afterburner, we apply the same procedure as
outlined in Section 3.4, only now the initial redshift estimate is
$z_{\rm rm}$. We label our final redshift estimate $z_{\rm sAB}$ (for
spectroscopic afterburner). Similarly, we tweak the photo-z error using
median statistics as with our original afterburner. Having defined our
new redMaGiC spectroscopically-trained photo-z estimates, we test the
redMaGiC photo-z performance using our test sample. The results are
shown in the left panel of Figure 10. The right panel of Figure 10 shows
a histogram of the quantity
$\Delta_z = (z_{\rm spec} - z_{\rm sAB})/\sigma_{z_{\rm sAB}}$ for the
redMaGiC testing sample. If the photo-zs were Gaussian, unbiased, and we
correctly estimated the photo-z error, then a histogram of the quantity
$\Delta_z$ would be well fit by a Gaussian of zero mean and unit
variance. The red Gaussian in the figure is not a best fit: it is a
Gaussian of zero mean and unit variance.
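The test described above can be summarized by a few numbers in addition
to the histogram. The sketch below is an illustrative check with
hypothetical function names: it forms the normalized residuals and
reports their mean, standard deviation, and the fraction lying within
one reported sigma. For Gaussian, unbiased photo-zs with correct errors
these should be close to 0, 1, and 0.68 respectively.

```python
import statistics


def normalized_residuals(z_spec, z_photo, sigma_z):
    """(z_spec - z_photo) / sigma_z for each galaxy."""
    return [(zs - zp) / s for zs, zp, s in zip(z_spec, z_photo, sigma_z)]


def residual_summary(residuals):
    """Mean, population standard deviation, and fraction within 1 sigma."""
    mean = statistics.mean(residuals)
    std = statistics.pstdev(residuals)
    frac_within_one = sum(1 for r in residuals if abs(r) <= 1.0) / len(residuals)
    return mean, std, frac_within_one
```

A standard deviation well above unity would indicate underestimated
photo-z errors, while a value well below unity would indicate
overestimated errors.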

Given the improved performance for the spectroscopically trained
redMaGiC sample, why do we not adopt this procedure as part of the
redMaGiC photometric redshift estimate by default? As discussed above,
biased spectroscopic sampling of our data set will introduce unknown and
uncontrolled biases in the resulting photometric redshifts.
Consequently, we have opted not to apply this spectroscopic afterburner
until a fully representative spectroscopic galaxy sample becomes
available, or data augmentation techniques are advanced enough to
extrapolate outside the training sets.


8 UNDERSTANDING redMaGiC OUTLIERS 

We now investigate the photo-z outliers in the redMaGiC galaxy sample.
We consider a galaxy an outlier if its photo-z is more than $5\sigma$
away from its spectroscopic redshift. The outlier fraction of redMaGiC
galaxies as a function of redshift is illustrated in Figure 11 for both
the fiducial and high luminosity samples. Perhaps the two most salient
features in this plot are: 1) the difference in the outlier fractions at
low redshifts between the SDSS DR8 and both the SDSS S82 and DES SV data
sets; and 2) the difference in the outlier fractions between the
fiducial and high luminosity galaxy samples. The latter result is not
surprising: the brighter the galaxy, the easier it is to distinguish
between red sequence and non red sequence galaxies. We will return to
the difference between the DR8 and S82/DES SV momentarily.
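The outlier definition above translates directly into a per-bin outlier
fraction. The following is a minimal sketch, assuming galaxies are
binned by photometric redshift and that each galaxy carries its own
reported photo-z error; the function name and the choice of half-open
bins are assumptions made for the example.

```python
def outlier_fraction_by_bin(z_spec, z_photo, sigma_z, bin_edges, n_sigma=5.0):
    """Fraction of galaxies with |z_photo - z_spec| > n_sigma * sigma_z,
    evaluated in bins of photometric redshift [lo, hi).

    Empty bins are reported as None.
    """
    fractions = []
    for lo, hi in zip(bin_edges, bin_edges[1:]):
        flags = [abs(zp - zs) > n_sigma * s
                 for zs, zp, s in zip(z_spec, z_photo, sigma_z)
                 if lo <= zp < hi]
        fractions.append(sum(flags) / len(flags) if flags else None)
    return fractions
```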

Consider first the DR8 outlier population. The mean DR8 outlier fraction
is small, $\approx 0.7\%$, and is split among 3 sets of outlier clumps,
as seen in Figure 3. The last of these clumps is more readily apparent
in the SDSS S82 data set. We consider each of these in turn.

8.1 Clump 1: Low Redshift Outliers 

We compare the rest-frame spectra of outliers in Clump 1 (the low
redshift outliers in Figure 3) to a control sample of non-outliers. The
control sample is comprised of galaxies with good photo-zs (within
$0.5\sigma$ of $z_{\rm spec} = z_{\rm photo}$). We randomly sample from
the control sample so as to mirror the photo-z distribution of the
outlier sample. We median-stack the spectra of both sets of galaxies,
arbitrarily normalizing them to unity over the wavelength range
$\lambda \in [5300\,\text{Å}, 5800\,\text{Å}]$. We have further smoothed
the spectra to make the resulting stacks easier to interpret by eye. The
two stacked spectra and their difference are shown in Figure 12.
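The stacking procedure described above can be sketched as follows. This
is an illustration under stated assumptions, not the code used for the
paper: it assumes all spectra have already been shifted to the rest
frame and resampled onto one common wavelength grid, it normalizes each
spectrum by its mean flux in the 5300 to 5800 Å window before stacking,
and it omits the smoothing step. Function names are hypothetical.

```python
import statistics


def normalize(wavelengths, flux, lo=5300.0, hi=5800.0):
    """Scale a spectrum so its mean flux in [lo, hi] Angstrom is unity."""
    window = [f for w, f in zip(wavelengths, flux) if lo <= w <= hi]
    norm = statistics.mean(window)
    return [f / norm for f in flux]


def median_stack(wavelengths, spectra):
    """Pixel-by-pixel median of the normalized spectra."""
    normalized = [normalize(wavelengths, s) for s in spectra]
    return [statistics.median(column) for column in zip(*normalized)]
```

Subtracting the stack of the control sample from the stack of the
outlier sample then gives a difference spectrum of the kind discussed in
the text.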

We find that the two spectra are largely consistent with each other for
wavelengths $\lambda > 5000\,\text{Å}$. At shorter wavelengths, however,
there is a clear excess of blue light in the photometric redshift
outliers. In addition, the spectra of the outlier galaxies have obvious
H$\alpha$ and [OII] lines, demonstrating these galaxies have ongoing
star formation.

Why is the fraction of outliers in Clump 1 so much larger in the S82 and
SV data sets relative to the DR8 sample? This is because the S82 and SV
redMaGiC selection was based solely on griz photometry, while for DR8 we
additionally included u-band photometry. As the u-band is sensitive to
the enhanced star formation in Clump 1 galaxies, the relative
contamination of these outliers is dramatically decreased in DR8
relative to the S82 and SV data sets. While we did not use the S82
u-band in the construction of the redMaPPer and redMaGiC catalogs (its
inclusion created problems with the higher redshift, $z \approx 0.5$,
cluster calibration), we do have the data available for us to test our
hypothesis. Figure shows S82 redMaGiC galaxies in the photometric
redshift slice $z_{\rm photo} \in [0.18, 0.22]$. Black points are
galaxies where the spectroscopic redshift of the galaxy is within
$2\sigma$ of our photometric estimate, while red points show
$\geq 5\sigma$ outliers. We see the vast majority of $5\sigma$ outliers
are unusually bright in u, as expected.

Figure 12. Top panel: Stacked rest-frame spectra for redMaGiC galaxies
with $z_{\rm photo} \in [0.18, 0.22]$. Outlier galaxies are shown in red
(Clump 1 in Figure 3), and non-outliers in black. Also shown are the
SDSS ugriz transmission curves for an extended source at $z = 0.2$
assuming 1.3 air masses (purple, blue, green, orange, red). Bottom
panel: Difference between the two spectra in the top panel, showing the
excess emission associated with the outlier galaxy population. The
vertical dotted lines mark the [OII] (left-most line) and H$\alpha$
(right-most line) emission lines. Clump 1 galaxies have excess blue
light, as well as [OII] and H$\alpha$ emission indicative of a small
amount of residual star formation.

8.2 Clump 2: Photo-z Biased High

Figure 10. Left: redMaGiC photometric redshift performance after
training with a spectroscopic sub-sample of galaxies. Points with error
bars show the bias in the recovered redshifts. The solid line shows the
photometric redshift scatter, while the dashed line shows the predicted
redshift scatter. Right: A histogram of the quantity
$\Delta_z = (z_{\rm spec} - z_{\rm photo})/\sigma_z$, where $\sigma_z$
is the reported photometric redshift uncertainty. The blue histogram is
for our fiducial redMaGiC sample, while the black histogram is for a
spectroscopically trained redMaGiC sample. The red curve is not a fit.
It is simply a Gaussian of zero mean and unit variance.

Figure 11. $5\sigma$ outlier fraction for the fiducial and high
luminosity DR8 (red), S82 (blue), and DES (black) redMaGiC samples as
estimated using SDSS DR12 spectroscopy.

We repeat the spectra-stacking procedure above for Clump 2 galaxies
(with photo-z biased high in Figure 3). For reasons that will become
apparent below, in Figure we plot not the difference between the outlier
and non-outlier spectra, but rather their ratio. Both sets of spectra
have been normalized as before. A blue light excess is immediately
apparent, and we again see both H$\alpha$ and [OII] emission. However,
the most salient feature is the slope of the flux ratio as a function of
wavelength, with the outlier spectra having a systematically steeper
continuum than the non-outlier galaxies. This slope is consistent with
internal dust-reddening in the galaxy. Specifically, the dashed blue
line is the predicted spectral ratio assuming an O’Donnell (1994)
reddening law with $E(B - V) = 0.15$.

It is worth noting why these dusty galaxies show up in our redMaGiC selection
only in this particular redshift range. At most redshifts, the rest-frame
reddening vector in broadband griz photometry is not parallel to the color
evolution vector of the red sequence. Consequently, at most redshifts a
galaxy that starts on the red sequence and is reddened simply moves off the
red sequence, and is not selected. By contrast, at z ~ 0.35, the rest-frame
reddening vector is parallel to the color evolution vector of the red
sequence, so dust reddening can move a galaxy from z_spec ~ 0.3 to
z_photo ~ 0.4. At the same time, the internal reddening suppresses the excess
blue emission, making excess blue light less useful as a discriminator for
these galaxies. Internal reddening also dims the galaxy, and thus tends to
increase photometric errors, making it even more difficult to distinguish
these galaxies from the expected template.
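For reference, the dust-reddening prediction for the ratio of the outlier to non-outlier spectra can be written in the standard extinction form below. This is a sketch rather than the authors' exact expression: the symbols k(λ) for the reddening curve, F_out and F_non for the two stacked spectra, and the constant C that absorbs the normalization of the stacks are notation introduced here.

```latex
\frac{F_{\rm out}(\lambda)}{F_{\rm non}(\lambda)}
  \simeq C \times 10^{-0.4\,k(\lambda)\,E(B-V)},
\qquad
k(\lambda) \equiv \frac{A(\lambda)}{E(B-V)},
\qquad
E(B-V) = 0.15 .
```

Because k(λ) rises toward shorter wavelengths in the optical, the predicted ratio falls toward the blue, which is the tilt attributed to internal reddening in the text.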



Figure 13. Distribution of our fiducial S82 redMaGiC galaxy sample in u − g
and g − r space for galaxies in the photometric redshift bin
z_photo ∈ [0.18, 0.22]. Black points are galaxies where our photometric
redshift estimate agrees with the spectroscopic estimate within 2σ, while red
points correspond to ≥ 5σ redshift outliers.



Figure 14. Top panel: Stacked rest-frame spectrum of outlier (red) and
non-outlier (black) redMaGiC galaxies for Clump 2 (see Figure ). Bottom
panel: Ratio of the outlier to non-outlier spectra (black line). The dashed
blue line shows the effects of internal dust reddening with E(B − V) = 0.15.
The vertical dotted lines mark the [OII] and Hα emission lines, indicating a
small amount of residual star formation, as with the Clump 1 galaxies.

8.3 Clump 3: Photo-z Biased Low

Finally, we repeat our spectral-stacking procedure for Clump 3 galaxies (with
photo-z biased low in Figure ). In Figure 15 we show the difference between
the outlier and non-outlier spectra (black line). As a comparison, we show
the difference between outliers and non-outliers for Clump 1 (red dashed
line), which are similar in that they have z_rm biased low. We



Figure 15. Difference between the outlier and non-outlier stacked rest-frame
spectra for Clump 1 (red) and Clump 3 (black) galaxies (see Figure ). The
vertical dotted lines mark the [OII] (left-most line) and Hα (right-most
line) emission lines. Clump 3 galaxies are qualitatively similar to those in
Clump 1, with residual star formation that is not large enough to drive the
galaxy from the photometric red sequence at SDSS depths.


see that the differences are qualitatively similar, but that the Clump 3
galaxies have excess emission that is significantly larger than that of
Clump 1. This makes sense: because the SDSS DR8 imaging is relatively
shallow, photometric errors are small only for the low-redshift Clump 1
galaxies, which makes the redMaGiC selection more efficient at rejecting
galaxies with excess emission there. In contrast, at higher redshifts, the
larger photometric errors allow galaxies with a larger excess emission to
pass the selection.

Having identified the physical origin of the various outlier populations of
redMaGiC galaxies, it may be possible to construct observables that allow us
to reject such galaxies from the redMaGiC sample. We leave an exploration of
this possibility to future work. Of course, it may be that some of the
outlier populations cannot be removed with the available photometry. For
instance, we expect Clump 1 outliers in the DES will be difficult to remove
without u-band. If these outlier populations are irreducible, then they must
be adequately characterized and the corresponding P(z) distributions for the
redMaGiC galaxies must be updated accordingly. Alternatively, the
corresponding redshift regions ought to be excluded from high precision LSS
studies.


9 SUMMARY AND CONCLUSIONS 

Photometric redshift systematics are the primary challenge that must be
overcome for pursuing LSS studies with photometric data sets. Based on the
fact that red sequence galaxies tend to have excellent photometric redshifts,
we have sought to address this challenge by refining red sequence selection
algorithms in the hope of creating a “gold” photometric galaxy sample for
photometric LSS studies. A complementary goal is to develop a new photometric
redshift estimator for these galaxies. The result is the redMaGiC algorithm.

Conceptually, the algorithm is exceedingly simple: one specifies a desired
comoving space density and luminosity threshold. The algorithm then fits all
galaxies with a red sequence template and assigns each galaxy a redshift.
Based on these redshifts, we apply the desired luminosity threshold. Finally,
we rank-order the galaxies by the goodness-of-fit statistic χ², and keep the
top N galaxies that lead to the desired comoving space density. In practice,
the algorithm is necessarily more difficult to implement due to the coupling
of the photometric redshift estimates to the galaxy density via the
photometric redshift afterburner, but the above description captures the
spirit of the algorithm well.
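The selection logic summarized above can be sketched in a few lines. This is a toy illustration under simplifying assumptions, not the redMaGiC implementation: it works in discrete redshift bins whose comoving volumes are supplied by the caller, it ignores the coupling to the photometric redshift afterburner, and the function name, arguments, and example values are all hypothetical.

```python
import numpy as np

def select_top_by_chi2(z, lum, chi2, z_edges, volumes, target_density, lum_min):
    """Toy selection: luminosity cut, then keep the best-chi2 galaxies per bin.

    volumes[i] is the comoving volume of redshift bin i (assumed input).
    Returns a boolean mask over the input galaxies.
    """
    z, lum, chi2 = (np.asarray(a, dtype=float) for a in (z, lum, chi2))
    keep = np.zeros(z.size, dtype=bool)
    for i in range(len(z_edges) - 1):
        # Galaxies in this redshift bin that pass the luminosity threshold.
        in_bin = (z >= z_edges[i]) & (z < z_edges[i + 1]) & (lum >= lum_min)
        idx = np.flatnonzero(in_bin)
        # Number of galaxies needed to reach the target comoving density.
        n_keep = int(round(target_density * volumes[i]))
        # Rank-order by goodness of fit and keep the top n_keep.
        best = idx[np.argsort(chi2[idx], kind="stable")[:n_keep]]
        keep[best] = True
    return keep

# Made-up inputs for illustration only.
z = np.array([0.21, 0.22, 0.23, 0.24, 0.31, 0.32])
lum = np.array([1.0, 0.4, 0.8, 0.9, 0.7, 0.6])       # in units of L*
chi2 = np.array([3.0, 0.5, 1.0, 2.0, 4.0, 1.5])
z_edges = [0.2, 0.3, 0.4]
volumes = [2000.0, 1000.0]                            # arbitrary volume units

mask = select_top_by_chi2(z, lum, chi2, z_edges, volumes,
                          target_density=1e-3, lum_min=0.5)
```

In this example the second galaxy has the best χ² but fails the luminosity threshold, so it is never ranked; the luminosity cut is applied before the rank-ordering, as in the description above.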

As shown in Section , we find that redMaGiC is indeed successful at
identifying red sequence galaxies, and that the corresponding photometric
redshift estimates are of very high quality, with a low bias (< 0.5%), low
scatter (< 1.6%), and a low rate of catastrophic outliers (≲ 2%), with the
exact values depending on the precise sample under consideration. As
demonstrated in Section , the redMaGiC selection yields galaxies with
superior photo-z performance to the standard color-cut selection method used
to define the SDSS CMASS sample. In addition, the photo-z scatter is
correctly estimated a priori. As detailed in Section , this performance is
comparable to the best machine learning photo-z algorithms available today
when the same input data are used. Machine learning algorithms can improve
upon the photo-z performance of redMaGiC if additional information is
provided, though the improvement remains modest.

There are, however, two critical advantages of redMaGiC photo-zs relative to
machine learning based algorithms. The first is that redMaGiC has minimal
spectroscopic requirements: it is much easier to get the necessary cluster
redshifts that enable the redMaGiC algorithm than it is to acquire
representative spectroscopic training samples of redMaGiC galaxies. The
second important difference is that, in the absence of representative
spectroscopic sampling, machine learning based algorithms are expected to be
biased for galaxies that fall outside the training data set, especially at
the faint end, as demonstrated in Figure . This failure mode is non-existent
for redMaGiC.

Of course, should representative spectroscopic training sets become available
for redMaGiC galaxies in the future, one should pursue machine learning
techniques to improve redMaGiC photo-zs. Even within the context of redMaGiC,
we explicitly demonstrated that representative spectroscopic sampling of
redMaGiC galaxies enables photo-z estimation that is unbiased at the 0.1%
level, and with extremely well characterized photo-z errors (Figure 10, right
panel).

Despite all of these successes, some additional work clearly remains. First,
the current photometric redshifts must be extended into P(z) distributions to
properly capture skewness and kurtosis where they exist, for instance near
filter transitions. Perhaps more importantly, however, the current samples
clearly exhibit three distinct classes of redshift outliers. We have been
able to identify the physical origin of these outliers: Clumps 1 and 3 in
Figure  are ellipticals or S0 galaxies with residual star formation, while
Clump 2 galaxies are very dusty (E(B − V) ≈ 0.15) elliptical/S0 galaxies.
These dusty galaxies also exhibit residual star formation, but the primary
reason they are outliers is their high dust content. We defer the question of
whether it is possible to photometrically identify these outliers and remove
them from the redMaGiC sample to future work.
APPENDIX A: STAR-GALAXY SEPARATION


To perform star/galaxy separation, we use object size estimates from the
ngmix multi-epoch shape fitting catalog (Jarvis et al., in prep). The ngmix
algorithm fits an exponential disc profile to each object (in all individual
observations of each griz band), and estimates an intrinsic (PSF-deconvolved)
size (exp_t), as well as an error on that size (exp_t_err). Figure A1 shows
the distribution of object sizes as a function of magnitude in the SPTE
footprint. The stellar locus at zero size is clearly separated from the
galaxy locus at the bright end. At the faint end, where the intrinsic size of
the galaxies is close to the typical seeing, it is harder to distinguish
between the two loci. Our goal here is to select as complete a galaxy sample
as possible while minimizing stellar contamination. Our task is made a little
easier by the fact that we are limiting ourselves to red galaxies with
z ≲ 0.8 and L/L* > 0.5, the magnitude limit of which is denoted with a dashed
red line in the figure.

Figure A1. Intrinsic object size, exp_t, as a function of m_i (as estimated
with MAG_AUTO). At the bright end, the stars are clearly separated from the
galaxies, while the confusion is apparent at m_i ~ 23. The magnitude of the
galaxies in the redMaGiC sample described here, with z < 0.8 and L/L* > 0.5,
is shown with a dashed black line.

Until we develop a full probabilistic star/galaxy separator from the ngmix
size estimator, we make use of simple cuts based on the intrinsic size and
the error on the size. At the bright end, we see that true stars do not have
intrinsic size exp_t > 0.002. At the faint end, we wish to make a selection
that has as high a galaxy completeness as possible while minimizing stellar
contamination. We make the ansatz that such a cut will take the form

\texttt{exp\_t} + n \times \texttt{exp\_t\_err} > 0.002, \qquad \text{(A1)}

where n is some number to be determined, and we expect 
n ~ 2. That is, we keep all objects that are consistent with 
being extended sources within observational errors. 
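The cut in equation A1 is simple enough to state directly in code. The sketch below is illustrative only: the function name and the example sizes are hypothetical, and the default n = 2 reflects only the expectation quoted in the text, since n is a free parameter to be determined.

```python
import numpy as np

def is_extended(exp_t, exp_t_err, n=2.0, size_floor=0.002):
    """Equation (A1): keep objects with exp_t + n * exp_t_err > size_floor."""
    exp_t = np.asarray(exp_t, dtype=float)
    exp_t_err = np.asarray(exp_t_err, dtype=float)
    return exp_t + n * exp_t_err > size_floor

# Made-up sizes and errors for illustration only.
exp_t = np.array([0.0100, 0.0005, -0.0010, 0.0000])
exp_t_err = np.array([0.0010, 0.0010, 0.0002, 0.0005])

mask_n2 = is_extended(exp_t, exp_t_err, n=2.0)
mask_n1 = is_extended(exp_t, exp_t_err, n=1.0)
```

Lowering n tightens the cut: the second object, whose measured size is small but noisy, is kept for n = 2 and rejected for n = 1.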

In order to choose a value of n, we make use of cross-correlation tests.
Specifically, stars and galaxies should be uncorrelated with each other.
Consequently, a non-zero cross-correlation between a galaxy sample and a
known stellar sample is indicative of stellar contamination in the galaxy
sample.

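Consider a sample of n̄ total objects that contains n̄_g galaxies and n̄_*
stars, so that n̄ = n̄_g + n̄_*. Writing the local density of the sample as
n = n̄(1 + δ), one then has

n = \bar{n}_g\,(1 + \delta_g) + \bar{n}_*\,(1 + \delta_*), \qquad \text{(A2)}

and therefore

1 + \delta = 1 + \frac{\bar{n}_g}{\bar{n}}\,\delta_g + \frac{\bar{n}_*}{\bar{n}}\,\delta_*. \qquad \text{(A3)}

Defining the stellar fraction of the sample f_* = n̄_*/n̄, we arrive at

\delta = (1 - f_*)\,\delta_g + f_*\,\delta_*. \qquad \text{(A4)}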

Now, if we cross-correlate this sample (subscript “obs”) with 


Figure A2. Incompleteness (1 − C, dashed lines) and stellar contamination
(f_*, solid lines) for four different magnitude bins, as a function of the
selection parameter n. The fainter galaxies tend to have lower completeness
and larger stellar contamination.


a known sample of stars, then we have:

w_{\rm obs,s} = \langle \delta\,\delta_s \rangle = f_*\,\langle \delta_*\,\delta_s \rangle = f_*\,w_{s,s}, \qquad \text{(A5)}

where δ_s is the fluctuation of a known stellar population, and we have
assumed δ_s = δ_* and used the fact that stars and galaxies are uncorrelated.
It follows from this assumption that the cross-correlation w_obs,s is
proportional to the stellar auto-correlation w_s,s. Consequently, we can
estimate the stellar contamination via


f_* = \frac{w_{\rm obs,s}}{w_{s,s}}. \qquad \text{(A6)}


By computing the above ratio for a galaxy sample selected using a cut n as
per equation A1, we seek to optimize our sample selection. To measure the
cross-correlations, we make use of the TreeCorr code (Jarvis et al. 2004).
f_* is obtained by computing the median value of the above ratio on scales of
1 to 10 arcmin.
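As a sketch of this last step, the function below takes correlation functions that have already been measured (for example with TreeCorr) and returns the median of the ratio of the cross-correlation to the stellar auto-correlation over 1 to 10 arcmin. The ratio is taken in the orientation implied by equation A5, w_obs,s = f_* w_s,s. The function name, argument names, and the numbers are hypothetical; no TreeCorr call is shown.

```python
import numpy as np

def stellar_fraction(theta_arcmin, w_obs_s, w_s_s, theta_min=1.0, theta_max=10.0):
    """Median of w_obs,s / w_s,s over theta_min <= theta <= theta_max."""
    theta = np.asarray(theta_arcmin, dtype=float)
    w_obs_s = np.asarray(w_obs_s, dtype=float)
    w_s_s = np.asarray(w_s_s, dtype=float)
    # Restrict to the requested angular range before taking the ratio.
    in_range = (theta >= theta_min) & (theta <= theta_max)
    ratio = w_obs_s[in_range] / w_s_s[in_range]
    return float(np.median(ratio))

# Made-up measurements for illustration only.
theta = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 20.0])            # arcmin
w_s_s = np.array([0.40, 0.30, 0.20, 0.10, 0.05, 0.02])        # star auto-correlation
w_obs_s = np.array([0.040, 0.012, 0.008, 0.005, 0.002, 0.004])  # cross-correlation

f_star = stellar_fraction(theta, w_obs_s, w_s_s)
```

Using the median rather than the mean limits the influence of a single noisy angular bin, such as the 5 arcmin bin in this example.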

We can use a similar method to estimate the completeness associated with our
star-galaxy separation cut. Specifically, consider again equation A1. For
large n, the selected sample should be highly complete. Suppose that at a
large n, call it n_max, the sample has N(n_max) objects, and a stellar
fraction f_*(n_max) estimated via cross-correlations. Since f_* is the
stellar fraction, it follows that the number of galaxies is
N(n_max)[1 − f_*(n_max)]. At a lower n, the number of galaxies
N(n)[1 − f_*(n)] will have decreased, and the relative completeness is simply

C(n) = \frac{N(n)\,[1 - f_*(n)]}{N(n_{\max})\,[1 - f_*(n_{\max})]}. \qquad \text{(A7)}

We set n_max = 5 to define the relative completeness, and look for the value
of n which results in the best compromise between purity and completeness.
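The relative completeness can be sketched as below. This assumes that f_* denotes the stellar fraction, so that the number of galaxies at a given n is taken to be N(n)[1 − f_*(n)]; that reading, the function name, and the example numbers are assumptions made for illustration.

```python
import numpy as np

def relative_completeness(n_values, n_objects, f_star, n_max=5.0):
    """C(n) = N(n) [1 - f_*(n)] / ( N(n_max) [1 - f_*(n_max)] )."""
    n_values = np.asarray(n_values, dtype=float)
    # Estimated number of galaxies at each value of the selection parameter n.
    n_gal = np.asarray(n_objects, dtype=float) * (1.0 - np.asarray(f_star, dtype=float))
    ref = np.flatnonzero(np.isclose(n_values, n_max))
    if ref.size != 1:
        raise ValueError("n_max must appear exactly once in n_values")
    return n_gal / n_gal[ref[0]]

# Made-up counts and stellar fractions for illustration only.
n_values = np.array([0.0, 1.0, 2.0, 5.0])
n_objects = np.array([900.0, 1000.0, 1100.0, 1250.0])
f_star = np.array([0.02, 0.04, 0.10, 0.20])

completeness = relative_completeness(n_values, n_objects, f_star)
incompleteness = 1.0 - completeness
```

By construction C(n_max) = 1, so the quantity is a completeness relative to the n_max sample rather than an absolute one.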

We have implemented the above method with two stellar selections, a bright
sample (19.0 < i < 21.5), and a faint sample (21.5 < i < 22.5). Figure A2
shows the results for the faint sample. Results for the bright sample are
difficult to interpret (see below for further details). The solid lines in
Figure A2 show the incompleteness (1 − C) as a function of n for three
different magnitude bins. The dashed lines show the f_* value for the same
bins. The faintest galaxies in the fiducial redMaGiC sample have i ~ 22, and
thus lie in between the red and purple lines. The point f_* = 1 − C for these
two lines is n ≈ −0.5 and n ≈ 2.5 respectively. We adopt as our fiducial cut
the mean of these two values, n = 1. From the figure, we expect ≈ 4% stellar
contamination and 4% galaxy incompleteness.

Results from the bright stellar reference sample are difficult if not
impossible to interpret. For instance, the completeness C estimated as above
using the bright sample is larger than unity. The estimated stellar fraction
using the bright stellar reference sample is ~ 10%. The difference between
the bright and faint stellar reference samples suggests that the assumption
δ_s = δ_* is in fact incorrect, and that a more reasonable model might be
δ_s ≈ kδ_*, for some k. Since all we seek here is an optimal star-galaxy
separation criterion, we adopt the proposed cut with n = 1 here, and leave
the problem of a more accurate estimate of the stellar contamination for the
redMaGiC galaxy sample to future work.

We emphasize that the stellar contamination fractions quoted above are those
relevant for the full galaxy catalog given the star-galaxy separation
criterion we have adopted. The stellar fraction of the redMaGiC catalog is
much lower, since an object must also have red sequence colors in order to
make it into the redMaGiC catalog. The only redshift at which the stellar
locus crosses the red sequence is z ≈ 0.7, so we expect ≈ 5% stellar
contamination at z ~ 0.7, but essentially no contamination at other
redshifts.


APPENDIX B: DATA CATALOG FORMATS 


The full redMaGiC SDSS DR8 and DES SV catalogs will be available at
http://risa.stanford.edu/redmapper/ in FITS format, and from the online
journal in machine-readable formats. A summary of the DR8 catalog is given in
Table B1, and of the SV catalog in Table B2. Absolute magnitudes in the
tables are computed using kcorrect v4.2 (Blanton & Roweis 2007).
k-corrections are applied assuming an LRG template, band shifted to z = 0.1.

The SDSS catalogs will be made publicly available upon publication of this
article in a journal. We plan to release the DES redMaGiC catalogs publicly
by January 2016. See the Dark Energy Survey website
(http://www.darkenergysurvey.org/) for instructions on how to download the
catalogs.
Table B1. redMaGiC SDSS DR8 Catalog Format

Column  Name             Format  Description
1       OBJID            I18     SDSS DR8 CAS object identifier
2       RA               F12.7   Right ascension in decimal degrees (J2000)
3       DEC              F12.7   Declination in decimal degrees (J2000)
4       IMAG             F6.3    SDSS i CMODEL magnitude (dereddened)
5       IMAG_ERR         F6.3    Error on i CMODEL magnitude
6       MODEL_MAG_U      F6.3    SDSS u model magnitude (dereddened)
7       MODEL_MAGERR_U   F6.3    Error on u model magnitude
8       MODEL_MAG_R      F6.3    SDSS g model magnitude (dereddened)
9       MODEL_MAGERR_R   F6.3    Error on g model magnitude
10      MODEL_MAG_I      F6.3    SDSS r model magnitude (dereddened)
11      MODEL_MAGERR_I   F6.3    Error on r model magnitude
12      MODEL_MAG_Z      F6.3    SDSS i model magnitude (dereddened)
13      MODEL_MAGERR_Z   F6.3    Error on i model magnitude
14      MODEL_MAG_Y      F6.3    SDSS z model magnitude (dereddened)
15      MODEL_MAGERR_Y   F6.3    Error on z model magnitude
16      MABS_U           F6.3    Absolute magnitude in u
17      MABS_ERR_U       F6.3    Error on absolute magnitude in u
18      MABS_G           F6.3    Absolute magnitude in g
19      MABS_ERR_G       F6.3    Error on absolute magnitude in g
20      MABS_R           F6.3    Absolute magnitude in r
21      MABS_ERR_R       F6.3    Error on absolute magnitude in r
22      MABS_I           F6.3    Absolute magnitude in i
23      MABS_ERR_I       F6.3    Error on absolute magnitude in i
24      MABS_Z           F6.3    Absolute magnitude in z
25      MABS_ERR_Z       F6.3    Error on absolute magnitude in z
26      ILUM             F6.3    i band luminosity, units of L*
27      ZREDMAGIC        F6.3    redMaGiC photometric redshift
28      ZREDMAGIC_E      F6.3    Error on redMaGiC photometric redshift
29      CHISQ            F6.3    χ² of fit to redMaGiC template
30      Z_SPEC           F8.5    SDSS spectroscopic redshift (-1.0 if not available)

Table B2. redMaGiC DES SV Catalog Format

Column  Name             Format  Description
1       COADD_OBJECT_ID  I18     DES SVA1 object identifier
2       RA               F12.7   Right ascension in decimal degrees (J2000)
3       DEC              F12.7   Declination in decimal degrees (J2000)
4       MAG_AUTO_G       F6.3    g MAG_AUTO magnitude (SLR corrected)
5       MAGERR_AUTO_G    F6.3    Error on g MAG_AUTO magnitude
6       MAG_AUTO_R       F6.3    r MAG_AUTO magnitude (SLR corrected)
7       MAGERR_AUTO_R    F6.3    Error on r MAG_AUTO magnitude
8       MAG_AUTO_I       F6.3    i MAG_AUTO magnitude (SLR corrected)
9       MAGERR_AUTO_I    F6.3    Error on i MAG_AUTO magnitude
10      MAG_AUTO_Z       F6.3    z MAG_AUTO magnitude (SLR corrected)
11      MAGERR_AUTO_Z    F6.3    Error on z MAG_AUTO magnitude
12      MABS_G           F6.3    Absolute magnitude in g
13      MABS_ERR_G       F6.3    Error on absolute magnitude in g
14      MABS_R           F6.3    Absolute magnitude in r
15      MABS_ERR_R       F6.3    Error on absolute magnitude in r
16      MABS_I           F6.3    Absolute magnitude in i
17      MABS_ERR_I       F6.3    Error on absolute magnitude in i
18      MABS_Z           F6.3    Absolute magnitude in z
19      MABS_ERR_Z       F6.3    Error on absolute magnitude in z
20      ZLUM             F6.3    z band luminosity, units of L*
21      ZREDMAGIC        F6.3    redMaGiC photometric redshift
22      ZREDMAGIC_E      F6.3    Error on redMaGiC photometric redshift
23      CHISQ            F6.3    χ² of fit to redMaGiC template
24      Z_SPEC           F8.5    Spectroscopic redshift (-1.0 if not available)
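As a usage sketch for the catalog formats above, the snippet below selects the galaxies that have a spectroscopic redshift, using the -1.0 sentinel described for the spectroscopic redshift column, and computes the median of z_spec − z_photo for that subset. The rows are made up, only three of the listed columns are mocked in a NumPy structured array, and reading the actual FITS files is not shown; the function name is hypothetical.

```python
import numpy as np

# Hypothetical rows using three of the column names listed in the tables.
catalog = np.array(
    [(0.250, 0.015, 0.255), (0.400, 0.018, -1.0), (0.610, 0.020, 0.600)],
    dtype=[("ZREDMAGIC", "f8"), ("ZREDMAGIC_E", "f8"), ("Z_SPEC", "f8")],
)

def spectroscopic_subset(cat, missing=-1.0):
    """Rows with a spectroscopic redshift (Z_SPEC equals `missing` otherwise)."""
    return cat[cat["Z_SPEC"] != missing]

spec = spectroscopic_subset(catalog)

# Median photo-z bias, z_spec - z_photo, for the spectroscopic subset.
median_bias = float(np.median(spec["Z_SPEC"] - spec["ZREDMAGIC"]))
```

Filtering on the sentinel before computing any statistic matters: leaving the -1.0 entries in would contaminate the bias estimate with large spurious offsets.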